Chapter 1
Introduction 


Future high-speed broadband networks based on Asynchronous Transfer Mode (ATM) technology will provide a large variety of services that cater to the needs of distributed multimedia applications [1,2]. The term distributed multimedia describes the emerging scenario in which a single integrated network carries a wide variety of media, such as audio, video, image or plain data, each associated with a traffic class. Not only will a broad range of traffic classes be carried, but a guaranteed quality of service (QoS) will also be provided to some of these classes. Providing such QoS guarantees while taking advantage of the resource gains offered by a statistically multiplexed transport mechanism remains a challenging task for network architects. The task is further complicated by the fact that traffic generated by a multimedia application may not fall into any specific traffic class supported by the network, because the number of new multimedia applications with diverse media types and QoS requirements keeps growing. Therefore, traffic submitted to the network must be shaped in order to receive maximum performance in terms of network QoS. For users¹ (network clients), traffic shaping allows better utilization of the available transport services and, in general, costs less because of the possible reduction in bandwidth (or allocated network resource) requirements. For the network, more connections can be admitted for a given QoS because of higher network utilization. However, QoS requirements for applications are typically end-to-end requirements, which impose corresponding performance requirements on both the network and the end-systems (hosts). Thus, applications can impose a set of constraints on the traffic shaping function. In this report, traffic shaping algorithms for a given set of constraints are introduced, and the effect of traffic shaping on bandwidth allocation and network performance guarantees is investigated.

In the next section, the traffic characteristics of multimedia applications and their corresponding communication requirements are summarized. Then a short description of the MPEG video standard is provided, since the bit streams used for the experiments in this report are based on MPEG encoded video sequences. A summary of ATM network services is introduced, along with new services proposed in the literature but not yet standardized. Next, a discussion of how traffic shaping allows applications to efficiently utilize the network transport services is presented, and the motivation of the research pursued in this report is explained. Finally, the chapter concludes by summarizing the main contributions of this report.


1  The  terms  application,  user  and  client  will  be  used  interchangeably  through  the  rest  of  the  report. 

1.1 Characterization of Multimedia Applications

The various forms of multimedia data can be categorized as static and continuous [3]. Examples of static media include plain data, raw ASCII, numerical data, image and vector graphics. Continuous media types include compressed or uncompressed audio and video, animation and interactive data, all of which imply a temporal dimension. Multimedia applications can be characterized by their traffic characteristics and the corresponding communications requirements. An application's traffic characteristics can be described as one or more sequences of packets, of arbitrary length, generated at certain times and destined for one or more locations. Each packet has an associated set of communications requirements. Traffic characteristics, together with the corresponding communications requirements, determine the network resources (bandwidth and buffer) needed to support the application.

1.1.1  Traffic  Characteristics 

The traffic characteristics of an application can be formally specified by its traffic generation process, which is basically a sequence of packets generated at arbitrary instants, each packet having an arbitrary length. If packet generation occurs at regular time intervals, the traffic pattern is periodic. If, in addition, the packet lengths are fixed, it is constant bit rate (CBR) traffic; otherwise it is variable bit rate (VBR) traffic. For example, uncompressed audio and video streams are typically CBR traffic. On the other hand, compressed video is VBR in nature, since the amount of compressed information varies according to the content and instantaneous scene changes. This report focuses on VBR traffic as defined above.
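
The definitions above can be turned into a small classifier. The following is a minimal Python sketch and not part of the report: the function name classify_traffic, the tolerance parameter tol and the returned label strings are illustrative assumptions.

```python
from typing import Sequence


def classify_traffic(times: Sequence[float], sizes: Sequence[int],
                     tol: float = 1e-9) -> str:
    """Label a packet trace as aperiodic, periodic CBR or periodic VBR.

    times: packet generation instants in seconds (ascending).
    sizes: packet lengths in bits, one per generation instant.
    tol:   allowed deviation between inter-packet gaps (an assumption).
    """
    if len(times) != len(sizes) or len(times) < 2:
        raise ValueError("need at least two packets and matching lengths")
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Periodic: every gap equals the first gap (within the tolerance).
    periodic = all(abs(gap - gaps[0]) <= tol for gap in gaps)
    if not periodic:
        return "aperiodic"
    # Periodic with fixed packet length is CBR, otherwise VBR.
    fixed_size = all(size == sizes[0] for size in sizes)
    return "periodic CBR" if fixed_size else "periodic VBR"
```

Under these assumptions, a trace of equally spaced, equally sized packets is labelled CBR, while a compressed video trace with one picture per period and varying picture sizes is labelled VBR.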

1.1.2  Communication  Requirements 

Multimedia communication requirements have been extensively covered in [4, 5]. In the following, the QoS parameters relevant to the underlying network services are summarized.

Bandwidth. The required bandwidth depends on whether the data is compressed and, if so, on the encoding scheme, especially for video and audio applications. Today's systems handle video data almost exclusively in compressed form in order to reduce transmission bandwidth and storage requirements. For a user-defined QoS level, it is possible to use one of the compression standards developed exclusively for audio and video. Some audio and video compression standards and their respective bandwidth requirements for particular applications are given in Table 1.1. It should be noted that the presented values are average bandwidths, since VBR traffic has both peak and average bandwidth values. Three video compression standards have been widely accepted: the International Organization for Standardization (ISO) Moving Picture Experts Group (MPEG) standard [6], Intel's Digital Video Interactive (DVI), and the International Telecommunications Union H.261 [7]. Practical experience with DVI and MPEG-1 suggests a total of 1.4 Mbps for audio and video, as it provides good video quality and accommodates commercial audiovisual equipment. The second phase of MPEG, known as MPEG-2, aims to address applications at broadcast TV sample rates for higher quality video coded at around 4 to 15 Mbps [8].

Table 1.1: Some audio and video compression standards and their respective bandwidth requirements for a given application.

Standard                              Bandwidth Requirement       Applications
ADPCM (CCITT G.723)                   24 Kbps                     Internet packetized voice communications
μ-law compressed PCM (CCITT G.711)    64 Kbps                     ISDN digital telephony service
MPEG-1 Audio                          256 Kbps                    48-kHz-sampled stereo CD-quality audio
H.120                                 1544 Kbps or 2 Mbps         videoconferencing
H.261                                 from 64 Kbps to 2 Mbps      videoconferencing, video telephony
MPEG-1                                < 1.86 Mbps                 CD-ROM, desktop video
MPEG-2 (Low)                          4 Mbps                      CIF, VHS-quality
MPEG-2 (Main)                         15 Mbps                     CCIR-601, studio TV
MPEG-2 (High 1440)                    60 Mbps                     4xCCIR-601, HDTV
MPEG-2 (High)                         80 Mbps                     production SMPTE 240M standard
MPEG-4                                from 4.8 Kbps to 64 Kbps    dial-up video, videophone over phone lines

End-to-end Delay. Much harder to meet than the pure bandwidth requirements are the delay restrictions that multimedia applications, in particular interactive distributed multimedia applications, impose on communications. The major components that contribute to end-to-end delay are as follows [9]:

•  source compression and packetization delay.

•  network transmission delay, including medium access control (MAC) delay, queuing delay inside the network and propagation delay.

•  end-system queuing and synchronization (playout) delay.

•  sink decompression, depacketization, and output delay.

Practical experience with multimedia conferencing systems and ITU standards suggests a maximum end-to-end delay of 150 msec for interactive video applications [10]. Uncompressed real-time video applications require a delay bound of 250 msec, but for compressed video the network transmission delay bound should be less than 250 msec, because the encoding and decoding delays, along with the queuing and playout delays, also contribute to the total delay. A wide range of maximum delay bounds can be specified because of the diverse requirements of distributed multimedia applications. For example, to support network-based video games, a response time of 50 msec or less is required for twitch actions [11]. On the other hand, for a typical video playback system, the initial set-up delay can be on the order of minutes, since all frames of the movie will be displayed at a constant rate and the set-up delay will not affect the user's perception of video quality.
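
To see how an end-to-end bound constrains the network, the four delay components listed above can be treated as a budget. The sketch below is illustrative only: the function name and the per-component values in the usage note are hypothetical and are not measurements from the report.

```python
def network_delay_budget_ms(end_to_end_ms: float, source_ms: float,
                            end_system_ms: float, sink_ms: float) -> float:
    """Return the delay left for the network after the other components.

    end_to_end_ms: application's end-to-end delay bound.
    source_ms:     source compression and packetization delay.
    end_system_ms: end-system queuing and playout delay.
    sink_ms:       sink decompression, depacketization and output delay.
    """
    budget = end_to_end_ms - (source_ms + end_system_ms + sink_ms)
    if budget < 0:
        raise ValueError("end-system delays alone exceed the end-to-end bound")
    return budget
```

For example, with a 150 msec end-to-end bound and assumed delays of 40 msec at the source, 30 msec for playout and 30 msec at the sink, 50 msec would remain for medium access, queuing and propagation inside the network.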

Traffic that has an upper bound on its transmission delay falls into the class of synchronous communication. Most audio-video communication assumes constant delay for all packets², which is called isochronous communication. Traffic can be distinguished among the following kinds [9]:

•  asynchronous  -  unrestricted  transmission  delay, 

•  synchronous  -  bounded  transmission  delay  for  each  message,  and 

•  isochronous  -  constant  transmission  delay  for  each  message. 

Isochrony does not have to be maintained across the entire path from the source (video encoder) to the sink (display device), only at the final destination. A playout buffer is used to recover isochrony after the variable delay contributed by the network. Some delay variance, or jitter, of up to 5 msec can be tolerated for practical purposes, but in this report only isochronous VBR traffic is investigated, so the issue of delay variance or delay jitter will not be considered as a separate item.

2  Here packet corresponds to a voice sample or a video picture.

Reliability. Two types of network errors can happen: bit corruption due to noise and packet loss due to congestion in the network [5]. It is expected that packet loss will be more common than bit errors because of the reliability of fiber as a transmission medium. In general, the packet loss rate should be less than 10^-3 for acceptable video quality. Timeliness is another important constraint for the correct delivery of a packet: for time-critical data, packet retransmission is not suitable when the round-trip time is greater than the maximum delay bound. In that case, forward error correction (FEC) techniques must be used to recover lost data. In this report, it is assumed that the network provides high reliability, such that the packet loss rate is guaranteed to be small.

1.2  MPEG  Video  Standard 

MPEG has been developed for storing video (and associated audio) on digital storage media, which include CD-ROM, digital audio tapes, magnetic disks and writable optical disks, as well as for delivering video over networks and telecommunication channels. Compared to other standards, such as CCITT H.120 and H.261, the MPEG standard provides better visual quality at higher rates. Phase one of the MPEG standard, known as MPEG-1, is not intended to provide broadcast television quality, so MPEG-2 has been developed to address the compression of television broadcast signals at 10-45 Mbps and to provide from VHS-quality to HDTV-quality broadcasting. At the frame level, MPEG-1 and MPEG-2 generate similar traffic characteristics; therefore only the MPEG-1 standard will be described.

The MPEG encoder produces a coded bit stream representing a sequence of encoded pictures from uncompressed video data, which is a set of pictures displayed sequentially. The MPEG standard specifies three types of encoded pictures: I (Intracoded), P (Predicted), and B (Bidirectional). Two parameters define the sequence of encoded pictures: M, the distance between I or P pictures, and N, the distance between I pictures. For example, if M and N are given as 3 and 15 respectively, then the sequence of encoded pictures is

IBBPBBPBBPBBPBB... 

where the pattern IBBPBBPBBPBBPBB repeats indefinitely. Pictures in an MPEG video sequence are organized into groups to facilitate random access. Each group of pictures (GOP) contains the repeating pattern, which makes it possible to begin decoding at intermediate points in the video sequence.
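
The relationship between M, N and the picture pattern can be illustrated with a short Python sketch. The function name gop_pattern is hypothetical, and the requirement that N be a multiple of M is a simplifying assumption made here, not a statement about the MPEG standard.

```python
def gop_pattern(m: int, n: int) -> str:
    """Build one group of pictures in display order.

    m: distance between consecutive I or P pictures.
    n: distance between consecutive I pictures (GOP length).
    """
    if m < 1 or n < 1 or n % m != 0:
        raise ValueError("this sketch assumes n is a positive multiple of m")
    pictures = []
    for position in range(n):
        if position == 0:
            pictures.append("I")      # each GOP starts with an I picture
        elif position % m == 0:
            pictures.append("P")      # every m-th picture is a P picture
        else:
            pictures.append("B")      # the rest are B pictures
    return "".join(pictures)
```

Calling gop_pattern(3, 15) reproduces the pattern IBBPBBPBBPBBPBB given above.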

MPEG uses an interframe coding technique called motion compensation, in which P- and B-frames exploit the temporal redundancy present in a video sequence and are coded with reference to other P- and/or I-frames. P-frames update the picture (using a predictive algorithm) from the last I- or P-frame. B-frames use bidirectional prediction and are coded with respect to the preceding I- or P-frame and the subsequent I- or P-frame in the sequence. In general, I-frames are much larger than P-frames, and P-frames are much larger than B-frames. Therefore, an MPEG encoder that compresses a video signal at a constant frame rate (e.g., 30 frames/sec) generates a coded bit stream with a highly variable instantaneous bit rate. Figure 1.1 shows an example of an MPEG video sequence. Note the rate fluctuations from one picture to another. In some cases, the fluctuations can be a factor of 10 or more. Consider an I-frame which is 140,000 bits long. Transmitting the I-frame in 1/30 second over a network would require a transmission capacity of 4.2 Mbps to be allocated. Then, during the next 1/30 second, the transmission capacity required for a B-frame drops to 0.8 Mbps. These very large rate fluctuations are a consequence of the use of interframe coding techniques in MPEG. Therefore, MPEG constitutes a good example of bursty VBR traffic. However, on average, the encoder output rate does not change as rapidly as the frame sizes do; it changes mainly when the scene in the video sequence being encoded changes. Pictures of scenes with more complexity and a lot of motion require more bits to encode. This observation, also pointed out in [13], is particularly important, since the size of the same type of picture in the previous GOP will be used to estimate future traffic when smoothing the MPEG video.

Figure 1.1: An MPEG video sequence.
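
The arithmetic in the I-frame example can be reproduced with a few lines of Python. This is a sketch for illustration: the function names are hypothetical, and the frame sizes in the usage note, other than the 140,000-bit I-frame, are assumed values.

```python
from typing import Sequence


def required_rate_bps(frame_bits: int, frame_rate_hz: float = 30.0) -> float:
    """Capacity needed to send one frame within a single frame period."""
    return frame_bits * frame_rate_hz


def peak_to_mean(frame_sizes_bits: Sequence[int]) -> float:
    """Ratio of the largest frame to the average frame size."""
    mean_bits = sum(frame_sizes_bits) / len(frame_sizes_bits)
    return max(frame_sizes_bits) / mean_bits
```

required_rate_bps(140_000) gives 4,200,000 bits/sec, the 4.2 Mbps quoted above, and the 0.8 Mbps quoted for the B-frame corresponds to a frame of roughly 26,667 bits. For an assumed sequence of one 140,000-bit frame followed by two 20,000-bit frames, peak_to_mean returns about 2.33.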

In this report, MPEG video sequences are used to evaluate the performance of the proposed algorithms. The data set consists of sequences of MPEG-1 frame sizes created at the Institute of Computer Science at the University of Würzburg [12]. In all, 19 sequences (sportscasts, movies, music videos, newscasts, talk shows, cartoons and sport events) of 40,000 frames each are provided in the data set. In addition, an MPEG-1 frame trace of the full-length Star Wars movie is used in some experiments [60].

1.3  Contributions 

Traffic shaping of bursty VBR traffic has been designed and implemented to guarantee conformance of the traffic to the negotiated contract. Most multimedia applications are time-critical and cannot tolerate large delays contributed by traffic shaping. As described in Section 1.1, multimedia applications have diverse requirements on the end-system in terms of delay and bandwidth guarantees. On the other hand, the behavior of the submitted traffic must be as close as possible to the ideal traffic desired by the network, so that the network can utilize its resources efficiently. New traffic shaping or smoothing³ algorithms must be designed with the following functionality to meet the above requirements:

1)  A wide set of application constraints must be satisfied. These constraints are usually expressed in terms of maximum smoothing delay and buffer size. Therefore, a single, unified solution must be provided for all possible sets of constraints.

2)  Smoothing must be optimal in the sense that, given a delay or buffer size bound, bursts of data must be spread over the allowed time as much as possible.

3)  Smoothing of traffic must take into account the underlying network service in order to submit traffic with the desired characteristics for maximum network utilization.

4)  Smoothing must address both stored (off-line) and interactive (real-time) applications.

3  The terms smoothing and traffic shaping will be used interchangeably through the rest of the report since, most of the time, shaped traffic is smoother than the original traffic because of the allowed smoothing delay or buffer.
The report describes the design and specification of a smoothing algorithm that possesses the features listed above. The proposed algorithm is shown to provide a desired traffic behavior for the network given a set of constraints imposed by the application. The smoothing algorithm can be viewed as a bridge closing the gap between the application and the network, each of which has its own specific requirements. The algorithm provides a single solution for all possible application scenarios by modeling the whole communication system from source to sink and by incorporating the constraints in the system model. Optimality of the algorithm is guaranteed by finding the shortest path through upper and lower bounds on the cumulative rate function that are derived from the given set of constraints. Even with a rudimentary rate prediction rule, the performance of the algorithm is shown to be superior to that of other algorithms proposed in the literature. The novel idea of choosing the rate based on either past information, to minimize traffic variation, or future prediction, to minimize the number of rate changes, is presented.

Traffic shaping has a direct impact on bandwidth allocation; therefore traffic shaping must be integrated with bandwidth allocation. This novel idea is presented by proposing a new bandwidth renegotiation algorithm that tracks the bandwidth requirements of the VBR video source using a moderate renegotiation rate, which results in higher bandwidth efficiency. This approach allows for deterministic delay bounds and constant-quality video transmission, in contrast to other approaches that provide non-deterministic smoothing delay bounds and variable-quality video transmission as a result of shaping the traffic at either the encoder buffer or the leaky bucket. In addition, the proposed scheme provides universal interoperability by decoupling the source from the network, so that any application can use the bandwidth renegotiation algorithm.

The effect of traffic shaping on network utilization is investigated, and two theorems are presented regarding the savings in end-to-end delay bound when all sources smooth their traffic. It has been proven that when ideal smoothing (smoothing of stored data or traffic with known future) is applied by all sources, the network can support more connections with the same QoS, even for the single-hop case, which indicates the optimality of the proposed smoothing algorithm.

Finally, a novel concept called aggregate smoothing is introduced, which integrates multiplexing of multiple video sources with traffic shaping. It is shown that the number of rate changes and the variation of the aggregate rate are significantly reduced, which allows the renegotiated CBR (RCBR) network service to be a cost-effective solution for real-time traffic when multiple traffic streams can be transmitted as a bundle. This result is particularly important for public networks carrying aggregated traffic, since network utilization is shown to increase with the use of aggregate smoothing.

The remainder of the report is organized as follows. Chapter 2 describes a lossless smoothing algorithm for isochronous VBR traffic given a set of constraints in terms of maximum smoothing delay and buffer size. The performance of the proposed algorithm is evaluated using MPEG video sequences and is compared to previous work. In Chapter 3, the effect of smoothing on deterministic end-to-end performance guarantees for packet-switching networks is investigated. Chapter 4 introduces a new concept called aggregate smoothing, which integrates traffic shaping with multiplexing of multiple video sources in order to smooth the aggregate rate. A discussion of the effect of aggregate smoothing on network utilization is also provided. Finally, Chapter 5 concludes the report with a summary of the presented work and some research issues for future work using the framework developed in this report.

Chapter 2

Lossless  Smoothing  Algorithms 
for  VBR  Traffic 


2.1  Introduction 

The transfer of compressed data is demanding in terms of network Quality of Service (QoS) requirements, since interframe compression techniques, such as those used in MPEG video, lead to a very bursty bit stream that complicates the problem of network resource management. The effect of smoothing traffic sources on QoS and network utilization has been an important issue in providing end-to-end performance guarantees [43-46]. Several techniques have been developed to control the rate fluctuations of video in order to alleviate congestion and to increase network utilization. Some of these techniques are lossy [32-34] and are inappropriate for smoothing rate fluctuations that are the consequence of interframe compression. The problem of smoothing subject to a delay bound D was analyzed in [35], assuming all picture sizes are known a priori, with the rate selected such that the number of rate changes over time is minimized. A similar algorithm, based on the one in [35], that allows specification of two more parameters, K, the number of pictures with known sizes, and H, a lookahead interval, was proposed in [13] to improve algorithm performance. Both algorithms consider picture delay only at the server, and no parameter for maximum buffer size is provided. Another work includes three algorithms to smooth VBR MPEG video in the presence of a leaky bucket ATM network access controller [49]. Optimal bandwidth allocation algorithms have been extensively studied in [36], where only stored video is assumed for delivery. In [20], causal and non-causal algorithms are presented that find the optimal allocation from a given set of discrete service rates, minimizing the total cost subject to either a buffering or a delay constraint. The non-causal algorithm has fairly high runtime complexity, which becomes impractical when the number of available service rates is large, and the requirement to use a set of discrete service rates limits the optimality of the solution. Other work considers smoothing in terms of statistical performance guarantees, for example [47]. There, smoothing is done by periodic averaging of a source's rate (PARing), and large deviations techniques are then used to determine the buffering requirements at the source and the loss probabilities inside the network.

One common feature of the previous approaches is that a solution is optimized for only a specific set of delivery requirements, e.g., from either the buffering or the delay point of view. This makes it difficult to modify a proposed algorithm for a set of requirements other than the one it was specified for. Another desired feature that is lacking in previous work is the ability to incorporate network feedback into the algorithm in order to adapt to changes in the QoS of the underlying network service. This requires that both server and client buffer status information be used in the algorithm for its adaptive behavior.

In this chapter, causal and non-causal smoothing algorithms for lossless transmission of compressed video data are introduced. Compressed video is bursty in nature because of the interframe compression techniques used by the encoder. A deterministic approach that does not allow data to be discarded at the end hosts is considered. The causal algorithm is characterized by a set of parameters expressing delivery requirements, which include maximum server and client buffer sizes, delay bounds, lookahead interval and number of pictures with known sizes. From the specified parameters, upper and lower bounds on the cumulative transmitted bit rate are derived. The shortest path through these bounds gives the minimum slope and therefore the minimum number of rate changes. Since this solution is non-causal, prediction techniques based on past history should be used to determine the picture sizes in the future for live data applications.
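
The idea of bounding the cumulative transmission can be sketched in Python. The model below is a deliberately simplified, server-side-only assumption and is not the report's derivation, which is given in Section 2.2 and also covers the client buffer. Time is counted in picture periods; picture i becomes available in slot i; the names cumulative_bounds, is_feasible, delay_slots and buffer_bits are hypothetical. The sketch only builds the bounds and checks a candidate schedule against them; it does not compute the shortest path.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def cumulative_bounds(sizes: Sequence[int], delay_slots: int,
                      buffer_bits: int) -> Tuple[List[int], List[int]]:
    """Lower and upper bounds on bits sent by the end of each slot.

    Assumed constraints (server side only):
      upper: cannot send bits that have not arrived yet;
      lower: bits that arrived delay_slots ago must already be sent, and
             the server backlog may not exceed buffer_bits.
    """
    if delay_slots < 0 or buffer_bits < 0:
        raise ValueError("delay_slots and buffer_bits must be non-negative")
    arrived = list(accumulate(sizes))
    lower, upper = [], []
    for i, total in enumerate(arrived):
        due = arrived[i - delay_slots] if i >= delay_slots else 0
        lower.append(max(due, total - buffer_bits, 0))
        upper.append(total)
    return lower, upper


def is_feasible(sent_per_slot: Sequence[int], lower: Sequence[int],
                upper: Sequence[int]) -> bool:
    """Check that a transmission schedule stays between the bounds."""
    sent = list(accumulate(sent_per_slot))
    return len(sent) == len(lower) and all(
        lo <= s <= hi for lo, s, hi in zip(lower, sent, upper))
```

With assumed picture sizes [9, 1, 1, 1], a one-slot delay bound and a large buffer, the schedule [4, 5, 1, 1] stays within the bounds and lowers the peak from 9 to 5, whereas the flat schedule [3, 3, 3, 3] violates the delay bound in the second slot.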

The approach taken in the design of the proposed algorithm allows it to address a wide range of application scenarios, from a live video-conference application with a small delay requirement on the order of milliseconds to a video-on-demand application where the delay can be very large, on the order of minutes, because of the large buffers used at the customer site. In the first scenario, the burstiness of the network bandwidth requirement is controlled by delaying data at the encoder buffer. In the second scenario, a prefetch buffer at the receiver is filled in advance of each burst by delivering more data across the network than needed, and is drained in the course of the burst. The performance of the algorithm has been demonstrated to be effective and, in some cases, better than that of other proposed techniques designed and optimized for those particular application requirements.

Figure 2.1: System model for smoothing algorithm.


This chapter is organized as follows. In Section 2.2, the system model is introduced and the boundary conditions on the cumulative transmitted bit rate are derived in order to satisfy buffer and delay constraints. In Section 2.3, a causal smoothing algorithm that finds the shortest path through the bounds in order to minimize the number of rate changes is introduced. In Section 2.4, experimental results of the original causal algorithm are presented, and an improvement to the algorithm that further reduces the number of rate changes is described. The performance of the algorithm is evaluated with respect to other techniques proposed in [13, 35, 48, 49]. The conditions under which the buffer or the delay constraint should be used to obtain the optimal performance from the non-causal algorithm are also specified. Finally, Section 2.5 gives some concluding remarks.


2.2  System  Model 

A video sequence is assumed to be displayed at the rate of 1/T pictures per second, where T is the picture period. The foundation of the proposed model is based on the work in [37], where the constraints on the output bit rate from the server are derived to ensure that the server and client buffers do not underflow or overflow. The delivery requirements can be divided into two groups:

a)  delay  bound  on  the  pictures  at  the  server  and/or  client  buffer(s). 

b)  server  and/or  client  buffer(s)  do  not  overflow  and/or  underflow. 

Although the algorithm described in this chapter is intended to be used for compressed video, which is isochronous in nature, any arbitrary bit stream, be it periodic or non-periodic, can utilize the algorithm by dividing time into minimum units of T seconds and recording the amount of data that arrived within each period or, in other words, by sampling the input traffic every T seconds. The value of T should be chosen to represent a unit of time that is small enough to bound the delay of the traffic that arrives in a period. For example, for distributed real-time simulation processes requiring bounded delay on each message they exchange with processes in other hosts, T determines the minimum message delay. Through the rest of the report, the term picture¹ will be used to denote the data that arrived within the period T, and the source traffic is assumed to be variable bit rate compressed video. The encoding of the video is open-loop, with no rate-control feedback from the smoothing algorithm. This differs from techniques that adjust the encoder's quantizer scale, resulting in a lower bit rate at the expense of poorer visual quality [32-34, 50]. The introduced algorithm is lossless in the sense that no data is discarded by the smoothing algorithm. It is assumed that lower layers are responsible for discarding or delaying data, e.g., tagging cells at the user-network interface when the smoothed traffic does not conform to the user-specified parameters, or discarding cells queued inside the network during congestion.

1  In this dissertation and the literature, the terms frame and picture are often used interchangeably.

The model for rate smoothing is similar to the one described in [39]. It consists of two FIFO queues: one takes its input from the output of an encoder (or storage system) and outputs to the network, and the other takes its input from the network and outputs to the decoder (see Figure 2.1). It is assumed that the encoding (decoding) time, or the data transfer time from the storage disk to the buffer, is less than or equal to T seconds. Let $E(t)$ denote the output rate of the encoder at time t and $E_i$ (i = 1, 2, 3, ...) the number of bits produced in the interval $[(i-1)T, iT)$, i.e., the size of picture i. Similarly, let $R(t)$ denote the transmission bit rate at time t and $R_i$ the number of bits transmitted during the interval $[(i-1)T, iT)$. Let $B^e(t)$ and $B^d(t)$ denote the instantaneous fullness (the amount of data in the buffer) of the server and client buffers, respectively. The server buffer receives bits at rate $E(t)$ from the encoder and outputs $R_i$ bits per period. The client buffer receives bits at rate $R(t)$ from the network and is drained at $E_i$ bits per period. $R(t)$ is a piecewise-constant function with possible rate changes occurring periodically at specific locations in time, e.g., at the beginning of each period. This differs somewhat from the approach in [13, 35, 48], which updates, in an aperiodic manner, the rate that will be used for all of the data belonging to each picture. For each interval with constant rate, it is assumed that packets or cells are scheduled to be sent uniformly spaced in time.

2.2.1  Delay  Constraints


The end-to-end delay of a picture consists of components such as the time spent in the server and client buffers and the network access and transmission delays. For live-video applications, most of the delay is at the server, since the server cannot send bits faster than the encoder can produce them, whereas for stored-data applications a picture can be prefetched into the client buffer well before its display time.

It is assumed that the end-to-end delay of a picture i, $D_{tot}$, is constant and defined as $D_{tot} = D_i^e + D_{net} + D_i^d$, where $D_i^e$ and $D_i^d$ are the delays at the server and client buffers, respectively, defined later in this chapter, and $D_{net}$ is the network delay. Let K be the number of pictures in the buffer ready to be sent to the network. The entire picture is assumed to have arrived in the buffer at the beginning of a period before the bits of that picture can be transmitted, so K = 1 is the minimum number of pictures that must be in the buffer before transmission can begin². If the last bit of picture i is sent at time $s_i$, then the delay of the picture at the server buffer is given by

$D_i^e = s_i + (K+1)T - iT$   (2.1)

which includes the picture's encoding, queuing and sending delays. A constant delay T is included in the definition to denote the upper bound on the encoding delay, defined as the difference between the frame capture time and the time at which the frame is fully encoded. Note that in an actual system the encoding of a picture may take less than T, in which case the delay of each picture may be smaller than the value calculated using (2.1), but the difference would be negligible.

2 This condition is also required to guarantee that the delay and buffer constraints are not violated, as will be explained later.
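As a numerical illustration of (2.1), consider the following values, which are chosen only for this example and are not taken from the experiments: $T = 1/30$ s, $K = 1$, and a picture $i = 3$ whose last bit is sent at $s_3 = 2.5T$.

```latex
D_3^e = s_3 + (K+1)T - iT = 2.5T + 2T - 3T = 1.5T = 0.05\ \text{s}
```

Under these assumed values, the encoding of picture 3 starts at $t = (i - (K+1))T = T$, so a last bit sent at $2.5T$ corresponds to a server delay of $1.5T$, in agreement with (2.1).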

At the decoder, a new time index $\tau$ is defined, which is zero when the first bit arrives at the client, and $\tau_0$ denotes the time at which the decoder starts decoding the first picture. The delay of picture i at the client is given by

$D_i^d = \tau_0 + (i-1)T - a_i$   (2.2)

where $a_i$ is the arrival time of that picture at the client buffer ($a_i = a_{i-1} + s_i - s_{i-1}$) and

$D^d_{max} + T \ge \tau_0 \ge a_1$.

Let $D^e_{max}$ and $D^d_{max}$ be the bounds on picture delay at the server and client buffers. For a given end-to-end delay, the choice of the parameter $D^e_{max}$ determines the average buffer occupancy at the server. Because of the conservation of bits in the transmission pipe, a lower buffer occupancy at the server implies a higher buffer occupancy at the client. If $D_{tot}$ is constant, then $D^d_{max} = D_{tot} - D_{net} - (K+1)T$. The choice of $\tau_0$ determines whether data can be prefetched to the client before it is generated at the encoder. For live-video applications, $\tau_0 = D^d_{max} + T$ must be chosen, since the server cannot send faster than the encoder and any smaller value results in starvation of the client buffer. For prerecorded data, however, a smaller value of $\tau_0$ is possible since data can be prefetched into the client buffer. Basically, $D^d_{max} + T - \tau_0$ determines the maximum number of pictures that can be prefetched to the client buffer. The choice of $\tau_0$ should also account for network jitter when the network delay is not constant. Let $D^{net}_{max}$ and $D^{net}_{min}$ be the maximum and minimum network delays. The time at which decoding of the first picture starts must include an extra delay of $(D^{net}_{max} - D^{net}_{min})$ in the case of variable delay, so $\tau_0$ must be chosen as $\tau_0 \ge a_1 + (D^{net}_{max} - D^{net}_{min})$ in order to compensate for network jitter.

Assuming T = 1, $L^e$ and $L^d$ are defined as $L^e = \lceil D^e_{max} - (K+1) \rceil$ and $L^d = \lceil D^d_{max} - \tau_0 + 1 \rceil$, where $D^e_{max} \ge (K+1)$ and $D^d_{max} \ge 0$, with $\varepsilon^e = L^e - (D^e_{max} - (K+1))$ and $\varepsilon^d = L^d - (D^d_{max} - \tau_0 + 1)$. At t = 0, K pictures are available and ready to be transmitted (encoding of the first picture is assumed to start at t = -K). Each boundary condition on the cumulative transmitted bit rate is expressed as $(x_i, y_i)$, where $x_i$ denotes the time at which boundary condition i applies and $y_i$ the value of the bound.

Lemma 2.1 The following sets of lower and upper bounds on the cumulative transmitted bit rate guarantee that $D_i^e \le D^e_{max}$ and $D_i^d \le D^d_{max}$:

$L_D = \{ (x_i, y_i) : x_i = i - \varepsilon^e,\; y_i = \sum_{j=1}^{i-L^e} E_j \}$  for $L^e < i \le N$,

and

$U_D = \{ (x_i, y_i) : x_i = i + \varepsilon^d,\; y_i = \sum_{j=1}^{i+L^d} E_j \}$  for $1 \le i \le N - L^d$.


Proof: 


Lower bound: Using (2.1), the last bit of the first picture must be sent before $t = D^e_{max} - K$, which requires

$\int_0^{D^e_{max}-K} R(s)\,ds \ge E_1$

for the transmitted bit rate. For picture i, the last bit must be sent by $t = D^e_{max} + (i - (K+1))$, which is satisfied if

$\int_0^{D^e_{max}+i-(K+1)} R(s)\,ds \ge \sum_{j=1}^{i} E_j$  or  $\sum_{j=1}^{i+L^e} R_j - \int_{D^e_{max}+i-(K+1)}^{i+L^e} R(s)\,ds \ge \sum_{j=1}^{i} E_j$,

from which

$\sum_{j=1}^{i} R_j - \int_{i-\varepsilon^e}^{i} R(s)\,ds \ge \sum_{j=1}^{i-L^e} E_j$

can be obtained, providing the lower bound on the cumulative transmitted bit rate at $t = i - \varepsilon^e$.

Upper bound: Using (2.2), the arrival time of picture i must satisfy $a_i \ge \tau_0 + (i-1) - D^d_{max}$. Since

$\int_0^{a_i} R(\tau)\,d\tau = \sum_{j=1}^{i} E_j$,  then  $\int_0^{\tau_0+(i-1)-D^d_{max}} R(\tau)\,d\tau \le \sum_{j=1}^{i} E_j$

is obtained. This can be written as

$\sum_{j=1}^{i-L^d} R_j + \int_{i-L^d}^{\tau_0+(i-1)-D^d_{max}} R(\tau)\,d\tau \le \sum_{j=1}^{i} E_j$  or  $\sum_{j=1}^{i} R_j + \int_{i}^{i+\varepsilon^d} R(\tau)\,d\tau \le \sum_{j=1}^{i+L^d} E_j$,

which provides the upper bound on the cumulative transmitted bit rate at $\tau = i + \varepsilon^d$.

When $D^e_{max}$ and $(D^d_{max} - \tau_0)$ are integer values, $x_i = i$ is obtained for both the upper and lower bounds, which gives:

$\sum_{j=1}^{i-L^e} E_j \le \sum_{j=1}^{i} R_j \le \sum_{j=1}^{i+L^d} E_j$   (2.3)

for $L^e < i \le N - L^d$.
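The integer case (2.3) can be checked mechanically. The following is a minimal sketch in C, assuming T = 1, 1-indexed arrays `E` and `R` (index 0 unused) and treating pictures outside 1..N as empty; the function names are hypothetical and do not appear in the original specification.

```c
#include <assert.h>

/* Cumulative size of pictures 1..n. E[1..N] holds picture sizes in bits
 * (E[0] is unused). Indices outside 1..N contribute zero (assumption). */
static long cum_bits(const long *E, int N, int n)
{
    long total = 0;
    if (n > N)
        n = N;
    for (int j = 1; j <= n; j++)
        total += E[j];
    return total;
}

/* Lower delay bound of (2.3) at period i: sum_{j=1}^{i-Le} E_j. */
long delay_lower_bound(const long *E, int N, int i, int Le)
{
    return cum_bits(E, N, i - Le);
}

/* Upper delay bound of (2.3) at period i: sum_{j=1}^{i+Ld} E_j. */
long delay_upper_bound(const long *E, int N, int i, int Ld)
{
    return cum_bits(E, N, i + Ld);
}

/* Returns 1 if the per-period schedule R[1..N] satisfies (2.3) for every
 * i with Le < i <= N - Ld, and 0 otherwise. */
int schedule_meets_delay_bounds(const long *E, const long *R,
                                int N, int Le, int Ld)
{
    long sent = 0;  /* cumulative transmitted bits, sum_{j=1}^{i} R_j */
    for (int i = 1; i <= N; i++) {
        sent += R[i];
        if (i <= Le || i > N - Ld)
            continue;  /* (2.3) is not stated for these periods */
        if (sent < delay_lower_bound(E, N, i, Le))
            return 0;  /* server delay bound would be violated */
        if (sent > delay_upper_bound(E, N, i, Ld))
            return 0;  /* client delay bound would be violated */
    }
    return 1;
}
```

With $L^d = 0$ the upper bound equals the bits produced so far, so a schedule that runs ahead of the encoder is rejected by this check.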

It should be noted that when $L^d = 0$ and $L^e > 0$, the server is not allowed to send faster than the encoder, and in the case of $L^e = 0$ and $L^d > 0$, data is prefetched to the client buffer.

2.2.2  Buffer  Constraints


The following conditions must be met in order to ensure that the server and client buffers do not overflow or underflow:

$B^e_{min} \le B^e(t) \le B^e_{max} \;\; \forall t$  and  $0 \le B^d(t) \le B^d_{max} \;\; \forall t$   (2.4)

where $B^e_{max}$ and $B^e_{min}$ denote the maximum and minimum server buffer sizes, respectively, and $B^d_{max}$ denotes the maximum client buffer size. It should be noted that $B^e_{min}$ can be negative, in which case it indicates the amount of data that can be prefetched to the client buffer. In the case of network jitter, extra buffer space is also required at the client since additional bits may arrive with a shorter delay. As a safety margin,

$B^d_{max} - \max_t \int_t^{t + D^{net}_{max} - D^{net}_{min}} R(s)\,ds$

must be used as the maximum client buffer size when computing the bounds.

Assuming that there are K pictures available in the buffer at t = 0, the buffer occupancy at the server after encoding picture i is

$B_i^e = B^e(i) = \int_0^{i} [E(s+K) - R(s)]\,ds + \sum_{j=1}^{K} E_j$

which can be written as

$B_i^e = \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - \sum_{j=1}^{i} R_j$   (2.5)

at t = i when $i \ge 1$.
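Equation (2.5) translates directly into a running sum. The sketch below is an illustration only, assuming T = 1, 1-indexed arrays and that pictures beyond N are empty; `server_buffer_occupancy` is a hypothetical name.

```c
#include <assert.h>

/* Server buffer occupancy B_i^e of (2.5) at t = i (T = 1):
 *   sum_{j=1}^{K} E_j + sum_{j=1}^{i} E_{j+K} - sum_{j=1}^{i} R_j.
 * E[1..N] are picture sizes and R[1..i] the bits sent per period
 * (index 0 unused). Pictures beyond N are treated as empty, which is an
 * assumption made here for the end of the stream. */
long server_buffer_occupancy(const long *E, const long *R,
                             int N, int K, int i)
{
    long occupancy = 0;
    for (int j = 1; j <= K && j <= N; j++)
        occupancy += E[j];              /* K pictures buffered at t = 0 */
    for (int j = 1; j <= i; j++) {
        if (j + K <= N)
            occupancy += E[j + K];      /* picture j+K arrives in period j */
        occupancy -= R[j];              /* bits drained in period j */
    }
    return occupancy;
}
```

Comparing the returned value with $B^e_{min}$ and $B^e_{max}$ at every period gives the server part of condition (2.4).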

For the client buffer, the time index $\tau$ is used, which is defined by $t = \tau + \text{network delay}$ ($\tau = 0$ when the first bit arrives at the client). At time $\tau_0$, decoding of the first picture starts. Let $L = \lceil \tau_0 \rceil$. The client buffer fullness at $\tau = \tau_0$ is given by

$B_0^d = \sum_{j=1}^{L-1} R_j + \int_{L-1}^{\tau_0} R(s)\,ds$.

The client buffer fullness at time $\tau = i + \tau_0$ is then given by

$B_i^d = B_{i-1}^d + \int_{i-1+\tau_0}^{i+\tau_0} R(\tau)\,d\tau - E_i$.

If this recursion is expanded and i is substituted with i - L, the following is obtained:

$B_{i-L}^d = \sum_{j=1}^{i-1} R_j + \int_{i-1}^{i-L+\tau_0} R(\tau)\,d\tau - \sum_{j=1}^{i-L} E_j$   (2.6)

at $\tau = i - L + \tau_0$ when $i \ge L$.


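The following lemma provides the set of upper and lower bounds for the buffer constraints.

Lemma 2.2 The following sets of upper and lower bounds on the cumulative transmitted bit rate guarantee that $B^e_{min} \le B^e(t) \le B^e_{max} \; \forall t$ and $0 \le B^d(t) \le B^d_{max} \; \forall t$:

$L_{SB} = \{ (x_i, y_i) : x_i = i,\; y_i = \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{max} \}$  for $i \ge 1$,

$L_{CB} = \{ (x_i, y_i) : x_i = i - L + \tau_0,\; y_i = \sum_{j=1}^{i-L} E_j \}$  for $i \ge L$,

$U_{SB} = \{ (x_i, y_i) : x_i = i,\; y_i = \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{min} \}$  for $i \ge 1$,

$U_{CB} = \{ (x_i, y_i) : x_i = i - L + \tau_0,\; y_i = \sum_{j=1}^{i-L} E_j + B^d_{max} \}$  for $i \ge L$.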


Proof:

Using (2.4), (2.5) and (2.6), the following relations can be derived to guarantee that neither the server nor the client buffer overflows or underflows:

$\sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{max} \le \sum_{j=1}^{i} R_j \le \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{min}$  at t = i when $i \ge 1$

and

$\sum_{j=1}^{i-L} E_j \le \sum_{j=1}^{i-1} R_j + \int_{i-1}^{i-L+\tau_0} R(\tau)\,d\tau \le \sum_{j=1}^{i-L} E_j + B^d_{max}$  at $\tau = i - L + \tau_0$ when $i \ge L$.

The rest of the proof is obvious.

Finally, $U = U_{SB} \cup U_{CB} \cup U_D$ and $L = L_{SB} \cup L_{CB} \cup L_D$ provide the sets of upper and lower bounds for the given buffer and delay constraints. A special case is when $D^e_{max}$, $(D^d_{max} - \tau_0)$ and $\tau_0$ are integers, which aligns all the bounds at $x_i = i$. Then the bounds can be expressed as:

$L_i \le \sum_{j=1}^{i} R_j \le U_i$   (2.7)

where

$L_i = \max\{ \sum_{j=1}^{i-L^e} E_j,\; \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{max},\; \sum_{j=1}^{i-L} E_j \}$

and

$U_i = \min\{ \sum_{j=1}^{i+L^d} E_j,\; \sum_{j=1}^{K} E_j + \sum_{j=1}^{i} E_{j+K} - B^e_{min},\; \sum_{j=1}^{i-L} E_j + B^d_{max} \}$.
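For the integer-aligned case, the bounds $L_i$ and $U_i$ of (2.7) can be evaluated as sketched below. This is an illustration under stated assumptions rather than the specification used later in this chapter: T = 1, the array `E` is 1-indexed, pictures outside 1..N count as empty, and the struct and function names are hypothetical.

```c
#include <assert.h>

/* Parameters of (2.7); all names are illustrative. */
typedef struct {
    int K;         /* pictures buffered at the server at t = 0 */
    int Le;        /* server delay slack in periods */
    int Ld;        /* client delay slack in periods */
    int L;         /* decoding start time tau_0 in periods */
    long Be_max;   /* maximum server buffer size in bits */
    long Be_min;   /* minimum server buffer size in bits */
    long Bd_max;   /* maximum client buffer size in bits */
} smoothing_params;

/* sum_{j=1}^{n} E_j, with pictures outside 1..N counted as zero. */
static long prefix_sum(const long *E, int N, int n)
{
    long total = 0;
    if (n > N)
        n = N;
    for (int j = 1; j <= n; j++)
        total += E[j];
    return total;
}

static long max3(long a, long b, long c)
{
    long m = a > b ? a : b;
    return m > c ? m : c;
}

static long min3(long a, long b, long c)
{
    long m = a < b ? a : b;
    return m < c ? m : c;
}

/* L_i of (2.7): the tightest of the three lower bounds at period i. */
long lower_bound_27(const long *E, int N, int i, const smoothing_params *p)
{
    /* sum_{j=1}^{K} E_j + sum_{j=1}^{i} E_{j+K} equals sum_{j=1}^{i+K} E_j */
    long produced = prefix_sum(E, N, i + p->K);
    return max3(prefix_sum(E, N, i - p->Le),
                produced - p->Be_max,
                prefix_sum(E, N, i - p->L));
}

/* U_i of (2.7): the tightest of the three upper bounds at period i. */
long upper_bound_27(const long *E, int N, int i, const smoothing_params *p)
{
    long produced = prefix_sum(E, N, i + p->K);
    return min3(prefix_sum(E, N, i + p->Ld),
                produced - p->Be_min,
                prefix_sum(E, N, i - p->L) + p->Bd_max);
}
```

Any cumulative schedule that stays between `lower_bound_27` and `upper_bound_27` at every period satisfies (2.7); if the lower value exceeds the upper value for some i, the chosen constraints cannot be met together.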


2.3  Smoothing  Algorithm 

This section describes how to find the optimal path through the derived upper and lower bounds using the shortest path algorithm. First, the shortest path algorithm, which is the key to the smoothing algorithm, is introduced. Then a causal smoothing algorithm is introduced that is specified by the following parameters: the delay bounds and buffer sizes at the server and client, the number of pictures with known sizes, and the look-ahead interval for improving the algorithm performance.

2.3.1  Shortest  Path 

In Section 2.2, the necessary upper and lower bounds on the cumulative transmitted bit rate were derived. The purpose of the rate smoothing algorithm is to find an optimal path through the bounds such that the number of rate changes is minimized. Since the slope of a path through the bounds gives the instantaneous rate, the optimal path should have the minimum slope and the shortest length in order to minimize the number of rate changes. Furthermore, this path minimizes the maximum rate since it has the smallest possible slope. The shortest path algorithm described in this section is based on the work in [38], which addresses the problem of constructing a Euclidean shortest path between two specified points that avoids a given set of barriers.

The shortest path algorithm uses two chains branching at some vertex, called a cusp. One chain includes a set of vertices $U_i$, i = 1, 2, ..., N, each of which belongs to the set of upper bounds, and the other chain includes vertices $L_i$ from the set of lower bounds. The source and destination points are specified as s and t, respectively. $D(v_i, v_j)$ is defined as the shortest path from $v_i$ to $v_j$, and $D_i$ as the union of the two chains, i.e., $D_i = D(s, U_i) \cup D(s, L_i)$. In [38], it is shown that both $D(s, U_i)$ and $D(s, L_i)$ must be inward-convex polygonal chains, i.e., each is convex with its convexity facing toward the interior of the area bounded by the upper and lower boundary points. The algorithm successively constructs $D_1, D_2, \ldots, D_N$ and finally $D(s, t)$, ensuring the inward-convexity of the upper and lower chains. In detail:

1. The algorithm constructs $D_1$ by connecting s to $U_1$ and $L_1$; $s_1 = s$, k = 1.

2. General Step (construct $D_{i+1}$ from $D_i$): The algorithm first creates $D(s_k, U_{i+1})$ (or $D(s_k, L_{i+1})$) by scanning all vertices in $D(s_k, U_i)$ beginning from $U_i$. Two cases are distinguished:

(i) A vertex $U_j$ ($1 \le j \le i$) is found in $D(s_k, U_i)$ such that, when it is connected with $U_{i+1}$, the inward convexity is not destroyed. All other vertices between $U_j$ and $U_{i+1}$, if any, are deleted.

(ii) If no such vertex is found in $D(s_k, U_i)$, then $D(s_k, L_i)$ is scanned until one vertex $L_m$ ($1 \le m \le i$) is found such that $L_m U_{i+1}$ is inside the polygon, and $s_{k+1} = L_m$. $D(L_m, U_{i+1})$ and $D(L_m, L_i)$ become the new upper and lower chains. $D(s_k, L_m)$ gives the shortest path between $s_k$ and $L_m$, and k is incremented: k = k + 1.

The general step is then applied to $D(s_k, L_i)$ to construct $D(s_k, L_{i+1})$.

3. Final Step: Once $D_N$ has been constructed, the general step is applied to con


Figure  2.2:  Illustration  of  the  shortest  path  algorithm.  In  (a)  two 
chains  are  preserved  and  in  (b)  a  new  cusp  is  found. 


The details and proof of the algorithm can be found in [38]. In Figure 2.2, the shortest path algorithm is illustrated, where $d_i$ is the diagonal between the two extreme points of the upper and lower chains. Figure 2.2(a) corresponds to case (i) of the general step, where the upper and lower chains are preserved. The chains are constructed in the order $s_1 U_1$, $s_1 L_1$, $s_1 U_1 U_2$, $s_1 L_2$, $s_1 U_3$ and $s_1 L_2 L_3$. Figure 2.2(b) illustrates case (ii) of the general step, where a new cusp is found. First, $s_1 U_1$, $s_1 L_1$, $s_1 U_1 U_2$ and $s_1 L_2$ are constructed. However, there is no vertex in $s_1 U_1 U_2$ that can keep the inward-convexity of the upper chain when connected with $U_3$, so $L_2$ becomes the new cusp, $s_2$. $s_2 U_3$ and $s_2 L_3$ become the new upper and lower chains, respectively.

The running time of the algorithm for N upper and lower bounds, in addition to s and t, is analyzed as follows. Case (i) of the general step takes constant time on average: at one extreme, $U_j = U_i$ for all i, which takes one comparison at every step (N comparisons in total), and at the other extreme, $U_j = s_k$ for all i and all k, which takes two comparisons at every step (2N comparisons in total). Case (ii) may involve scanning a large number of vertices; however, once a vertex has been scanned and the corresponding angle has been found to require continuation of the scanning process, that vertex is eliminated from further consideration, since it belongs to the shortest path whether or not it becomes the new cusp. The algorithm therefore runs in O(N) time.
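The inward-convexity test used in the general step is not spelled out here, so the following is only one common way to express it, given as a sketch. It assumes that time increases along the x axis, that the upper chain must have non-decreasing slopes and the lower chain non-increasing slopes (the usual funnel shape opening away from the cusp), and it uses hypothetical names throughout.

```c
#include <assert.h>

typedef struct {
    double x;  /* time */
    double y;  /* cumulative bits */
} point;

/* Cross product of (b - a) and (c - b): positive for a left
 * (counter-clockwise) turn at b, negative for a right turn, and zero
 * when a, b and c are collinear. */
double turn(point a, point b, point c)
{
    return (b.x - a.x) * (c.y - b.y) - (b.y - a.y) * (c.x - b.x);
}

/* Assumed convention: the upper chain stays inward-convex when the slope
 * does not decrease at `vertex`, i.e. prev -> vertex -> next is not a
 * right turn. */
int keeps_upper_chain_convex(point prev, point vertex, point next)
{
    return turn(prev, vertex, next) >= 0.0;
}

/* Assumed convention: the lower chain stays inward-convex when the slope
 * does not increase at `vertex`, i.e. prev -> vertex -> next is not a
 * left turn. */
int keeps_lower_chain_convex(point prev, point vertex, point next)
{
    return turn(prev, vertex, next) <= 0.0;
}
```

In case (i), a candidate vertex $U_j$ would be accepted under this convention when `keeps_upper_chain_convex` holds for its predecessor on the chain, $U_j$ itself and $U_{i+1}$; the test costs a constant number of arithmetic operations, in line with the comparison counts above.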

2.3.2  Causal  Algorithm  Design  and  Specification

The shortest path algorithm determines a path through the bounds non-causally. For stored-data applications, smoothing is realized by simply applying the shortest path algorithm to the bounds pre-computed using (2.7). However, this solution cannot be implemented for live video applications since the sizes of future pictures are not known. In this case, bounds are derived using estimates of future picture sizes. The rate prediction problem has been studied extensively from several perspectives, including linear (Kalman) prediction [40], an artificial neural network based approach [25], an AR(1)-based bandwidth estimator [20] and other promising methods described in [41]. Previous work showed that a very simple prediction rule using past information leads to satisfactory performance for the purpose of smoothing [13, 35]. Especially for large delay bounds, smoothing is not very sensitive to the estimation error resulting from imperfect forecasting. The causal algorithm is designed independently of the estimation rule. Since any estimation method is most effective for pictures in the near future, only the bounds within a look-ahead window are computed using the picture size estimates. The causal algorithm is designed with a parameter H which specifies the size of the look-ahead window. The optimal value of H depends on the estimation method and the source bit rate characteristics. For $x_i = i$, the look-ahead window is shifted by one picture period at every step and possible rate changes occur only at the beginning of a period. For non-integer $x_i$, the step size is the time difference between two consecutive boundary points and rate changes occur only at $x_i$. The condition $K \ge 1$, i.e., that there must be at least one picture with known size within the look-ahead window, is imposed because the sequence of upper and lower bounds computed from picture size estimates may violate the delay and buffer constraints in the case of large estimation errors. By ensuring that the size of the current picture is known, it is guaranteed that the computed smoothed rate will not violate the bounds.

#include <math.h>   /* fminf, fmaxf */

typedef struct { float x, y; } coordinate;

/* Helpers specified elsewhere: size() is described in the text,
   find_shortest_path() in Figure 2.4. */
int    size(int i, int j);   /* returns zero for pictures outside 1..N */
double find_shortest_path(coordinate *upper_bound, coordinate *lower_bound, int H);
void   notify(float rate, int i);

void Smooth(int H, int K, int N, int De, int Dd, int Be_max, int Be_min,
            int Bd_max, int tau0, int *pic_size) {
    int i, j;
    int up_d = 0, low_d = 0, up_b_enc, low_b_enc, up_b_dec = 0, low_b_dec = 0;
    int temp_up_d, temp_low_d, temp_up_b_enc;
    int temp_low_b_enc, temp_up_b_dec, temp_low_b_dec;
    int Ld, Le, L, init_buf = 0;
    float rate, sum = 0;
    coordinate upper_bound[H+1], lower_bound[H+2];

    Ld = Dd - tau0 + 1;
    Le = De - (K+1);
    L  = tau0;
    for (i = 1; i < K+1; i++)
        init_buf += size(i, i);
    for (i = 1; i <= Ld; i++)
        up_d += size(i, K);          /* pictures 1..Ld count toward every delay upper bound of (2.3) */
    up_b_enc  = init_buf - Be_min;
    low_b_enc = init_buf - Be_max;
    up_b_dec  = Bd_max;
    rate = (float)pic_size[1] / (Le + Ld);   /* choose an initial rate; assumes Le + Ld > 0 */
    for (i = 1; i < (N + Le + Ld); i++) {
        /* lower_bound[0] and upper_bound[0] hold the source point */
        upper_bound[0].x = lower_bound[0].x = 0;
        upper_bound[0].y = lower_bound[0].y = sum;
        /* destination is at lower_bound[H+1] */
        lower_bound[H+1].x = H+1;
        lower_bound[H+1].y = sum + (H+1)*rate;   /* rate should not change if possible */
        temp_up_d = up_d;   temp_up_b_enc = up_b_enc;   temp_up_b_dec = up_b_dec;
        temp_low_d = low_d; temp_low_b_enc = low_b_enc; temp_low_b_dec = low_b_dec;
        for (j = 0; j < H; j++) {
            /* bounds occupy indexes 1..H so that index 0 keeps the source point */
            upper_bound[j+1].x = lower_bound[j+1].x = j+1;
            temp_up_d      += size(i+j+Ld, i+K);
            temp_up_b_enc  += size(i+j+K,  i+K);
            temp_up_b_dec  += size(i+j-L,  i+K);
            temp_low_d     += size(i+j-Le, i+K);
            temp_low_b_enc += size(i+j+K,  i+K);
            temp_low_b_dec += size(i+j-L,  i+K);
            upper_bound[j+1].y = fminf(temp_up_d, fminf(temp_up_b_enc, temp_up_b_dec));
            lower_bound[j+1].y = fmaxf(temp_low_d, fmaxf(temp_low_b_enc, temp_low_b_dec));
        }
        rate = find_shortest_path(upper_bound, lower_bound, H);
        notify(rate, i);             /* notify server the rate for period i */
        sum = sum + rate;
        /* size() is used instead of pic_size[] so that indexes outside 1..N contribute zero */
        up_d = up_d + size(i+Ld, i+K);           low_d = low_d + size(i-Le, i+K);
        up_b_enc = up_b_enc + size(i+K, i+K);    low_b_enc = low_b_enc + size(i+K, i+K);
        up_b_dec = up_b_dec + size(i-L, i+K);    low_b_dec = low_b_dec + size(i-L, i+K);
    }
}

Figure 2.3: Specification of the causal smoothing algorithm.

In Figure 2.3, a description of the causal algorithm is presented for input parameters with integer values. It is assumed that T = 1. The parameters of the smoothing function include H for the look-ahead window size, K for the number of pictures with known sizes in the server buffer, N for the number of pictures to be smoothed, $D^e_{max}$ for the delay bound at the server, $D^d_{max}$ for the delay bound at the client, $B^e_{max}$ and $B^e_{min}$ for the maximum and minimum buffer sizes at the server, $B^d_{max}$ for the maximum buffer size at the client and $\tau_0$ for the time when decoding starts. There are two functions in the specification. The function size(i, j) returns, at period j, either the actual size of picture i or an estimated size, depending on the parameter K, for $1 \le i \le N$; otherwise zero is returned. The function find_shortest_path() finds the shortest path through the bounds in the look-ahead window and returns the slope of the first edge of the shortest path as the smoothing rate for period i. The destination point is selected so as to minimize the number of rate changes over time: it is obtained by taking the rate of the previous period and keeping it constant over the look-ahead window.
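The behavior described for size(i, j) can be sketched as follows. The sketch is illustrative only: it assumes that the second argument identifies the last picture whose size is already known, that unknown sizes are estimated from the most recent known picture one or more pattern periods earlier (the simple rule used in the experiments of Section 2.4.1), and that the most recent known picture is used when no such picture exists. The name `picture_size`, the extra parameters and the fallback are assumptions and are not part of Figure 2.3.

```c
#include <assert.h>

/* Sketch of size(i, j). E[1..N] holds the picture sizes (E[0] unused),
 * gop_len > 0 is the period of the picture-type pattern, and last_known
 * is the index of the last picture whose size is already available. */
long picture_size(const long *E, int N, int gop_len, int i, int last_known)
{
    if (i < 1 || i > N)
        return 0;                   /* outside the sequence */
    if (i <= last_known)
        return E[i];                /* actual size is known */

    int ref = i;
    while (ref > last_known)
        ref -= gop_len;             /* step back to a known picture of the same type */
    if (ref >= 1)
        return E[ref];              /* estimate from the same picture type */

    /* Assumed fallback: no earlier picture of the same type exists. */
    return last_known >= 1 ? E[last_known] : 0;
}
```

Because the estimator is isolated in this one function, a different prediction rule can be substituted without touching the smoothing loop, which matches the statement that the causal algorithm is designed independently of the estimation rule.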

In Figure 2.4, a description of the shortest path algorithm is presented. The smoothing rate is defined as alpha*min + (1 - alpha)*max, where min is the rate on the lower path and max the rate on the upper path. Two conditions can terminate the algorithm: either a new cusp is found, or the destination point is reached, in which case both paths are valid. When a new cusp is found, the shortest path follows the path on which the cusp lies, so alpha = 1 if the cusp belongs to the lower path and alpha = 0 if it belongs to the upper path. If the destination point is reached (the lower path connects the source to the destination point), alpha = 1 (lower path) is chosen. In the next section, it will be described how to improve the performance of the algorithm by choosing alpha such that the number of rate changes is minimized. Since the algorithm takes O(H) time at every step and H is usually chosen to be a small number, it can easily be implemented in real time.

/* Helpers whose bodies are not listed in this figure; declared so that the
   listing is complete. flag = 0 while the lower path is being extended,
   flag = 1 while the upper path is being extended. */
int check_for_convex(int k, int i, int flag);
int check_for_bound(int k, int i, int flag);

double find_shortest_path(coordinate *upper_bound, coordinate *lower_bound, int H) {
    int i, k;
    int up_index[H+1], down_index[H+2];   /* indexes of the bounds forming each path */
    int num_up = 2, num_down = 2;         /* number of bounds on each path */
    int exit_flag = 1;   /* exit condition: stays 1 when the destination can be reached
                            by both paths; set to 0 when a new vertex is found in the
                            look-ahead window */
    double min, max;     /* minimum and maximum values of the rate which satisfy the bounds */
    double alpha = 1;    /* rate = alpha*min + (1-alpha)*max */

    up_index[0] = down_index[0] = 0;      /* source is the vertex */
    up_index[1] = down_index[1] = 1;      /* first step of the shortest-path algorithm */
    for (i = 2; i < H+2; i++) {           /* general step */
        for (k = num_down-1; k > -1; k--)
            if (check_for_convex(k, i, 0)) {   /* check whether lower_bound[i] can be
                                                  connected to lower_bound[down_index[k]] */
                down_index[k+1] = i; num_down = k+2;   /* append lower_bound[i] to lower path */
                goto upper_loop;
            }
        for (k = 1; k < num_up; k++)      /* search upper path for a new vertex */
            if (check_for_bound(k, i, 0)) {    /* check whether upper_bound[up_index[k]]
                                                  can be a new vertex */
                exit_flag = 0; alpha = 0; goto finish;   /* possible modification here */
            }
upper_loop:
        if (i < H+1) {   /* the destination (i = H+1) lies on the lower path only */
            for (k = num_up-1; k > -1; k--)
                if (check_for_convex(k, i, 1)) {   /* check whether upper_bound[i] can be
                                                      connected to upper_bound[up_index[k]] */
                    up_index[k+1] = i; num_up = k+2;   /* append upper_bound[i] to upper path */
                    goto finish_loop;
                }
            for (k = 1; k < num_down; k++)    /* search lower path for a new vertex */
                if (check_for_bound(k, i, 1)) {    /* check whether lower_bound[down_index[k]]
                                                      can be a new vertex */
                    exit_flag = 0; alpha = 1; goto finish;   /* possible modification here */
                }
        }
finish_loop: ;
    }
finish:
    min = (lower_bound[down_index[1]].y - lower_bound[0].y) /
          (lower_bound[down_index[1]].x - lower_bound[0].x);
    max = (upper_bound[up_index[1]].y - upper_bound[0].y) /
          (upper_bound[up_index[1]].x - upper_bound[0].x);
    if (exit_flag) alpha = 1;   /* choose the minimum rate if both paths are valid */
    return alpha*min + (1-alpha)*max;
}

Figure 2.4: Specification of the shortest-path algorithm.

Although the sequence of upper and lower bounds can be pre-computed ahead of the transmission for stored-data applications, in an actual system implementation the algorithm should be adaptive in order to respond to changes in the buffer and delay constraints caused by changes in network QoS guarantees and application requirements. For example, in fully-interactive VOD systems, most video sequences to be transmitted are computed in real time to support operations such as scanning and editing. For networks without any QoS guarantees, feedback mechanisms can be used to change the parameters of the smoothing algorithm to provide adaptive behavior. In such cases, the causal algorithm should be used, since future traffic is not deterministic and the sequence of upper and lower bounds must be re-computed.

2.4  Experiments 

In this section, the experimental results of the smoothing algorithm are provided to show that the proposed scheme is effective in smoothing bursty traffic under the given set of delay and buffer constraints. A large number of experiments using four MPEG video sequences are performed to evaluate the performance of the causal algorithm. First, the effect of the parameters on the original causal algorithm is examined. Then an improvement to the original algorithm is described that further decreases the number of rate changes. In order to justify the proposed algorithm, the existing smoothing algorithms in the literature are summarized and two of the promising algorithms are applied to the same set of MPEG video sequences. The results indicate that the proposed method provides better rate smoothing, especially for bursty traffic with rapidly changing picture sizes. Finally, the conditions under which a buffer or a delay constraint should be used when smoothing prerecorded source traffic are discussed.

2.4.1  Performance  of  Original  Causal  Algorithm 

Live-video applications require buffering at the server for smoothing since the server cannot send faster than the encoder can produce. To evaluate the performance of the causal algorithm, the delay constraints $D^e_{max} = D + T$ and $D^d_{max} = 0$ seconds are used with no buffer constraints. The choice of client or server for buffering has no effect on the performance: from (2.3), as long as $L^e + L^d$ is constant, the smoothed rate functions for various values of $L^e$ and $L^d$ are almost identical except for a phase shift in time, because the difference between a lower bound and its corresponding upper bound is always the same. This indicates that the choice of a delay value at the server or client depends on other factors, such as the cost of the implementation and the availability of resources, rather than on the performance of the smoothing algorithm.

Four MPEG video bit streams are used in the experiments, each of which has 40,000 frames per sequence (about 22 minutes of video for 30 frames/sec frame rate) with encoded picture parameters of N = 12 and M = 3 [12]. The video bit streams include two movies, “Star Wars” and “Terminator II”, a cartoon, “Asterix”, and a sports event, “Formula 1 Race: GP Hockenheim 1994”. Table 2.1 shows the compression rates and basic statistics of the video sequences used in the experiments. As shown in Figure 2.5, Formula 1 contains many rapid movements and scene changes so its trace shows very large changes in any type of frames, and the B frames are often the same size as the P frames. Star Wars and Terminator 2 show the behavior of typical movie sequences with high compression rates because of the slow changing scenes compared to those in the sports events.

Table 2.1: Statistics of the MPEG video sequences used in the experiments.

                      ----------- Frames -----------   ------------ GOPs ------------
Sequence      Comp.   Mean [bits]   CoV    Peak/Mean   Mean [bits]   CoV    Peak/Mean
              rate
Asterix        119       22,348     0.90      6.6        268,282     0.47      4.0
Formula 1       86       30,749     0.69      6.6        369,006     0.38      3.6
Star Wars      130       15,599     1.16     11.9        187,185     0.39      5.0
Terminator 2   243       10,904     0.93      7.3        130,865     0.35      0.74

In the experiments, the size of picture i, if not known, is estimated to be $E_{i-N}$,
which uses the fact that pictures i and i-N are of the same type (I, B or P). Figure 2.6
shows the original bit rate as a function of time for the first 200 pictures of the
Formula 1 video sequence, together with the smoothed rates for three values of the delay
bound, D = 0.067, 0.133 and 0.2 seconds. The algorithm is run with parameter values of
H = 6 and K = 1. There is still some burstiness in the smoothed rate for both the causal
and non-causal algorithms when D = 0.067 second. For D = 0.133 second, the non-causal
algorithm gives a smooth rate function, whereas the causal algorithm output is still
bursty by comparison. For D = 0.2 second, both algorithms give very smooth output rate
functions. Therefore, D = 0.2 second is a good parameter value for the causal algorithm
if a delay of up to 0.2 second is allowed, and D = 0.133 second is already an excellent
value for the non-causal algorithm.
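
The picture-size estimate described above can be sketched in a few lines of Python. The
function name and the list-based trace representation are illustrative assumptions and
are not part of the original specification. Stepping back by further multiples of N when
picture i-N is itself unknown is also an assumption, since the text only states the
single-step estimate.

```python
def estimate_picture_size(known_sizes, i, gop_length):
    """Return the size of picture i, or an estimate if it is not yet known.

    known_sizes: sizes (in bits) of the pictures produced so far, index 0 first.
    i:           index of the picture whose size is requested.
    gop_length:  N, the distance between pictures of the same type.
    """
    if i < 0:
        raise ValueError("picture index must be non-negative")
    if gop_length <= 0:
        raise ValueError("gop_length must be positive")
    j = i
    # Step back N pictures at a time until a known picture of the same type is found
    # (assumption: more than one step may be needed for long look-ahead intervals).
    while j >= len(known_sizes):
        j -= gop_length
    if j < 0:
        raise ValueError("no earlier picture of the same type is known")
    return known_sizes[j]


# Hypothetical one-GOP trace with N = 12 (I B B P B B P B B P B B), sizes in bits.
trace = [120, 20, 20, 60, 20, 20, 60, 20, 20, 60, 20, 20]
next_i_frame = estimate_picture_size(trace, 12, 12)  # estimated from picture 0
next_p_frame = estimate_picture_size(trace, 15, 12)  # estimated from picture 3
```

With H = 6 and N = 12 as in the experiments, the look-ahead never reaches beyond one GOP,
so the single-step estimate is the only case exercised.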
Figure 2.5: Two MPEG video sequences: Formula 1 and Star Wars.

Figure 2.7 shows the delays of individual pictures for two comparisons. In the upper
graph, three delay bounds of the causal algorithm are compared. As shown, the delays
of pictures are bounded by 0.067, 0.133 and 0.2 second as specified for the causal
algorithm. No delay bound violation has been observed in any of the experiments since
K ≥ 1 for all cases. In the lower graph, the picture delays of the causal and non-causal
algorithms for D = 0.2 second are compared. It is observed that when the input rate
increases rapidly, the causal algorithm underestimates the picture sizes, resulting in
larger picture delays, and when the input rate decreases rapidly, the picture sizes are
overestimated, resulting in smaller picture delays compared to those of the non-causal
algorithm.

Figure 2.6: Rate (Mbps) as a function of time for original bit stream and three delay bounds.

Different quantitative measures can be defined to characterize the effectiveness of
smoothing [13]. Three of them are used to study algorithm performance as each of the
parameters D, H and K varies. The measures are:


•  the number of times the rate function is changed by the algorithm over N periods.

•  the maximum value of the rate function over N periods.

•  the standard deviation (S.D.) of the rate function over N periods.
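
The three measures listed above can be computed from a per-period rate schedule as in
the following minimal sketch. The function name is hypothetical, and the use of the
population standard deviation is an assumption, since the text does not say which
estimator was used.

```python
import statistics


def smoothing_measures(rates):
    """Return (number of rate changes, maximum rate, standard deviation).

    rates: transmission rate chosen by the smoothing algorithm in each period.
    """
    if not rates:
        raise ValueError("rates must contain at least one period")
    # A rate change is counted whenever two consecutive periods differ.
    changes = sum(1 for prev, cur in zip(rates, rates[1:]) if cur != prev)
    # Population standard deviation over all periods (assumed estimator).
    return changes, max(rates), statistics.pstdev(rates)


# Hypothetical five-period schedule in Mbps.
changes, peak, sd = smoothing_measures([1.0, 1.0, 2.0, 2.0, 1.0])
```

A schedule that holds one rate for long stretches scores well on the first measure even
if its peak is high, which is why the three measures are reported separately.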
Figure 2.7: Delays of pictures in Formula 1 video sequence.


Figure 2.8 shows the three quantitative measures as a function of delay bound for
the four MPEG video sequences when the causal algorithm is applied as specified in
Figure 2.3. As expected, when the delay bound is increased, the rate function becomes
smoother. The maximum rate decreases rapidly at first and then stays almost the same
after D = 0.2 second, and the standard deviation of rate behaves similarly for all cases.
An interesting observation is that the smoothing effect on the number of rate changes for
Formula 1 is larger than for Star Wars and Terminator 2. This can be explained as
follows: when the estimation error is large, the rate function tends to oscillate between
maximum and minimum values, resulting in constant rates that last longer, whereas in the
other cases many rate changes are needed to track the input traffic. That is why the
standard deviation of rate for Formula 1 is also larger.

Figure 2.9 shows the quantitative measures as a function of the look-ahead interval,
H, for the four MPEG video sequences. There is no advantage in using large values of H
since the number of rate changes increases as H increases. The standard deviation of rate
and the maximum rate do not show any noticeable improvement for values of H larger
than 5.

K should be as small as possible to reduce picture delay at the server. The
experimental results in Figure 2.10 show that K does not affect the standard deviation of
rate and the maximum rate significantly, but the number of rate changes is a decreasing
function of K. It is also observed that for K > 5, all four rate functions have a similar
number of rate changes, which indicates that the smoothing algorithm has the same degree
of effect on MPEG encoded bit streams with different statistics when the future is known
(assuming the bit streams are encoded with the same parameters, e.g., M and N are the
same). A common result of all three sets of experiments is that the number of rate
changes is more sensitive to estimation error than the other two measures.
Figure 2.10: Performance of causal algorithm as a function of parameter K.


2.4.2  Improvement  of  Shortest  Path  Algorithm 


The introduced algorithm gives a relatively high number of rate changes due to the
estimation error of the future picture sizes. From (2.3), it can be seen that lower bounds
are derived using the sizes of pictures in the past and upper bounds are derived using the
estimates of pictures in the future. Two observations can be made from the experiments
in Section 2.4.1. First, lower bounds are less prone to estimation error than upper
bounds because they rely on a higher number of known picture sizes. Second, choosing the
lower path when the destination point is reached does not guarantee that the same rate
can be chosen in the next period; instead, the rate should be chosen between the maximum
and minimum values in order to increase the possibility that it falls within the new
bounds in the next period. The effect of this choice is an increase in the maximum rate
and the standard deviation of rate. The following modifications to the specification of
the shortest path algorithm in Figure 2.4 decrease the number of rate changes
significantly:

1. When a new cusp is found, no matter to which path it belongs, choose alpha=1
(always choose the lower path or the minimum rate).

2. If the destination point is directly connected to the source point, which corresponds
to the case of the rate being the same as in the previous period, keep it the same
(alpha=1); otherwise, take the average of the maximum and minimum rates (alpha=0.5).

The choice of alpha=0.5 provides good performance on average. Although an optimal
value exists for each video sequence, the experiments indicate that the improvement in
performance is not significant when the optimal value of alpha is used, so alpha=0.5 is
used for all cases.
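
The role of alpha in the two modifications can be illustrated with a single-period
simplification. The sketch below assumes that the rate is a convex combination
alpha * minimum + (1 - alpha) * maximum, which matches the stated endpoints (alpha=1
gives the minimum rate, alpha=0.5 the average) but is not spelled out in the text. The
cusp detection of Figure 2.4 is not reproduced; it is passed in as a flag, and all
function names are hypothetical.

```python
def blend_rate(min_rate, max_rate, alpha):
    """Assumed form: alpha=1 selects the minimum rate, alpha=0 the maximum rate."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * min_rate + (1.0 - alpha) * max_rate


def next_rate(prev_rate, min_rate, max_rate, new_cusp=False):
    """Single-period illustration of the two modifications.

    prev_rate: rate used in the previous period.
    min_rate, max_rate: feasible rate range for the coming period.
    new_cusp: True when the shortest-path search found a new cusp
              (detection itself is outside the scope of this sketch).
    """
    if new_cusp:
        # Modification 1: always take the lower path (alpha = 1).
        return blend_rate(min_rate, max_rate, 1.0)
    if min_rate <= prev_rate <= max_rate:
        # Modification 2, first case: the previous rate is still feasible, keep it.
        return prev_rate
    # Modification 2, second case: average of maximum and minimum (alpha = 0.5).
    return blend_rate(min_rate, max_rate, 0.5)


kept = next_rate(2.0, 1.0, 3.0)                    # previous rate still feasible
averaged = next_rate(5.0, 1.0, 3.0)                # previous rate infeasible
lowest = next_rate(5.0, 1.0, 3.0, new_cusp=True)   # new cusp found
```

Choosing the midpoint leaves equal slack on both sides, which is one way to read the
argument that an interior rate is more likely to remain within the next period's bounds.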

2.4.3  Comparison  of  Smoothing  Algorithms 

The performance of the proposed algorithm can be judged better when it is compared
with other smoothing algorithms in the literature. Five of these algorithms address the
problem of smoothing real-time VBR MPEG video with a delay constraint. The first three
algorithms, described in [49], are designed to minimize the maximum rate and the standard
deviation of rate rather than the number of rate changes, whereas the last two are
similar to each other, the last one improving on the previous one in the number of rate
changes [13, 48].

Algorithm 1 (Uniform over delay interval) The data for each picture is spread out for
transmission uniformly across D picture times, so that the data for that particular
picture that needs to be sent in one frame time is $E_i / (D - 1)$. This algorithm gives
the worst performance since each picture is smoothed independently of the others, without
taking advantage of the I-B-P pattern of MPEG video.

Algorithm 2 (Uniform over GOP interval) The objective is to attempt to achieve
uniformity of the traffic over the entire GOP, to prevent bursts in the traffic profile
related to the occurrence of I or P frames. However, excessive delays occur if all frames
within a GOP are buffered, so the algorithm finds the necessary number of bits that should
be transmitted based on the buffer occupancy and delay constraint. This algorithm gives
the best performance in the standard deviation of rate, but suffers from a high number of
rate changes.
Algorithm 3 (Transmit as fast as possible below peak rate) The basic idea is to
transmit at the peak rate that has been determined to be needed, as long as the bit rate
is less than the peak and the buffer is not empty. This approach ensures that the buffer
occupancy just before the start of the I-frame is as small as possible, so that the
minimum rate required for the I-frames is minimized. In fact, this algorithm gives the
highest standard deviation since the transmission rate drops to the individual picture
rates after the encoder buffer is empty. However, the minimum peak rate is achieved
compared to the first two algorithms.

Algorithm 4 The algorithm computes a lower bound for the transmission rate during
each period such that the delay bound D is satisfied. Upper bounds for the rate are
computed to ensure continuous transmission. If the rate during the previous period is
between the upper and lower bounds of the current frame, the rate remains the same;
otherwise, the rate is chosen as the upper or lower bound of the current frame, a rule
intended to reduce the number of rate changes. The drawback of this algorithm is that the
rate changes, although small, remain frequent, which makes network scheduling difficult.
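
The rate-selection rule of Algorithm 4 can be sketched as follows, given a precomputed
list of (lower, upper) bounds per period. How the bounds are derived is specified in [13]
and is not reproduced here. The function name is hypothetical, and picking the violated
bound (the one nearest to the previous rate) is an assumption about which of the two
bounds is meant.

```python
def clamp_rate_schedule(bounds, initial_rate):
    """Keep the previous rate while it is feasible, otherwise move to a bound.

    bounds: list of (lower, upper) rate bounds, one pair per period.
    initial_rate: rate in force before the first period.
    Returns the list of rates chosen for each period.
    """
    rates = []
    rate = initial_rate
    for lower, upper in bounds:
        if lower > upper:
            raise ValueError("lower bound exceeds upper bound")
        if rate < lower:
            rate = lower      # previous rate too low: move up to the lower bound
        elif rate > upper:
            rate = upper      # previous rate too high: move down to the upper bound
        # otherwise the previous rate is still feasible and is kept unchanged
        rates.append(rate)
    return rates


# Hypothetical bounds in Mbps for four periods.
schedule = clamp_rate_schedule(
    [(1.0, 3.0), (2.0, 4.0), (3.5, 5.0), (1.0, 2.0)], initial_rate=2.0
)
```

Because the chosen rate always sits exactly on a bound after a change, a small shift of
the bounds in the next period is enough to force another change, which is consistent with
the small but frequent changes noted above.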

Algorithm 5 Another algorithm, based on Algorithm 4, introduces two more parameters
to choose the rate between the upper and lower bounds instead of the exact bounds when
a rate change is needed. The rate changes are indeed decreased, in some cases by a factor
of four, but there are two main problems with this algorithm. First, it is difficult to
predict the optimal values of those two parameters when encoding of the video is in
real-time. Second, significant increases in the maximum rate are observed, especially for
larger delays.

Algorithms 6 and 7 The original and improved causal algorithms, respectively.
Table 2.2: Performance of 7 algorithms applied to Formula 1 and Star Wars
MPEG sequences.

         Formula 1                               Star Wars
Algor.   No. of Rate  Maximum      S.D. of       No. of Rate  Maximum      S.D. of
No.      Changes      Rate (Mbps)  Rate (Mbps)   Changes      Rate (Mbps)  Rate (Mbps)
1        39973        4.34304      0.38648       39903        1.42608      0.2
2        38793        3.34138      0.35546       30219        1.29996      0.17329
3        37440        3.33158      0.5046        23198        1.29966      0.19
4        12306        3.39466      0.38912       22046        1.30709      0.1916
5        5243         4.27505      0.394093      20934        1.547572     0.190818
6        13167        3.339203     0.361009      23589        1.29519      0.173892
7        4892         3.445754     0.368680      18304        1.395934     0.180995

Table 2.2 presents the results of applying the seven algorithms to the video sequences
of Formula 1 and Star Wars with D=0.2 seconds, H=12 and K=1. Algorithms 1-3 are
implemented according to the descriptions given in [49], and Algorithms 4 and 5 using the
specifications provided in [13] and [48] respectively. The optimal values of the two
parameters needed by Algorithm 5 for the Formula 1 and Star Wars sequences are used as
provided in [48]. Algorithm 2 gives the smoothest rate in terms of standard deviation.
Algorithm 7 provides the minimum number of rate changes and also very good performance in
the standard deviation of rate, whereas Algorithms 2, 3, and 6 provide the smallest
maximum rate. The results indicate that Algorithm 7 performs best overall, with the
minimum number of rate changes and a low standard deviation of rate at the cost of a
small increase in the maximum rate.

The different behavior of the algorithms can be explained as follows. The first three
algorithms utilize only the sizes of pictures in the buffer and do not take advantage of
future picture sizes, resulting in a relatively high number of rate changes, but also the
smoothest rate. Algorithms 4 and 5 utilize the sizes of pictures in the future to derive
both the upper and lower bounds by using simple estimates, which results in relatively
worse performance compared to Algorithms 6 and 7, since both upper and lower bounds are
affected by the estimation error. On the other hand, the proposed algorithm uses the sizes
of pictures in the past to derive the lower bounds and estimates of the pictures in the
future to derive the upper bounds. This separation provides the flexibility to control the
behavior of the algorithm: for better maximum rate and standard deviation, the lower path
is chosen (Algorithm 6); for the minimum number of rate changes, the upper bounds are
utilized (Algorithm 7).

In order to evaluate the performance of the algorithms as a function of delay bound,
experiments using Algorithms 4-7 and the non-causal algorithm, which corresponds to
ideal smoothing, were conducted. The results are given in Figures 2.11 and 2.12 for the
Formula 1 and Star Wars video sequences. As expected, Algorithm 7 is superior in the
number of rate changes and the standard deviation of rate, especially for D > 0.15
seconds. The problems associated with Algorithm 5 can be seen by observing the significant
increases in the peak rate and the standard deviation of rate for both video sequences. It
should be noticed that Algorithm 6 allows for better tracking of the input rate, and thus
better smoothing with smaller maximum rate and standard deviation, but with frequent
changes of rate. For video bit streams with rapidly changing scenes and picture sizes, as
in the case of the Formula 1 trace, both Algorithms 6 and 7 are better choices if the peak
rate needs to be minimized when allocating network bandwidth. The effect of smoothing on
Formula 1 is a drastic decrease in the number of rate changes, but this does not apply to
the case of Star Wars. This is because the difference between the upper and lower bounds
is large when input traffic is changing rapidly with a large estimation error margin,
resulting in fewer rate changes. For slowly changing traffic, rate prediction is heavily
affected by the estimation error since the error margin is smaller, although when the
delay bound is allowed to be more than 0.2 second, the difference in performance between
the ideal and causal algorithms is marginal.
Figure 2.11: Performance of three algorithms as a function of delay bound for Formula 1.

Figure 2.12: Performance of three algorithms as a function of delay bound for Star Wars.

2.4.4 Buffer vs Delay Constraint


The issue of using either a buffer or a delay constraint can be addressed when the
availability of resources is a factor in the design of the system. Although either of the
constraints provides effective smoothing, the performance of the smoothing algorithm
for each constraint can be evaluated using a cost function defined as follows:


$$\text{Total Cost} = \max_i (D_i) + \frac{\max_i (B_i)}{E_{avg}} \qquad (2.8)$$


where $D_i$ is the delay of picture i, $B_i$ is the amount of data in the buffer at the
end of period i, and $E_{avg}$ is the average picture size. The second term in (2.8)
denotes the approximate number of pictures queued in the buffer.

A set of experiments was conducted to derive the cost of smoothing as a function of
each of the three performance measures defined in Section 2.4.1. The ideal smoothing
algorithm was applied to frame traces of Formula 1 and Star Wars using buffering only at
the client. Figures 2.13 and 2.14 show the results of the experiments for the Formula 1
and Star Wars frame traces. The buffer constraint costs less when the number of rate
changes is used as the performance measure, except for very large delay bounds or buffer
sizes, where the cost is almost the same. The delay constraint costs less for all possible
delay bounds and buffer sizes when the maximum rate is the performance measure. In the
case of the buffer constraint, the difference between lower and upper bounds is usually
larger than in the case of the delay constraint, resulting in longer intervals with
constant rate but also in a larger maximum rate, since the delay constraint allows for
better tracking of the input rate. However, when the standard deviation of rate is the
performance measure, the statistics of the trace determine which constraint is more
effective. In the case of the Formula 1 trace, the delay constraint costs more, whereas
for the Star Wars trace, the buffer constraint costs more. This is due to the fact that
Formula 1 has many rapidly changing scenes, resulting in large increases or decreases in
picture sizes even for the same picture type (I, B, or P), which can be smoothed more
effectively when the difference between upper and lower bounds is sufficiently large or
almost constant, as in the case of the buffer constraint. The Star Wars trace includes
slowly changing scenes and is smoother (the size of pictures of the same type does not
change rapidly in consecutive scenes) compared to the Formula 1 trace, so the delay
constraint is more effective in tracking the input traffic.
2.5 Conclusion

The smoothing of video traffic plays an important role in the design of video
communication systems. As part of a research project on the design of transport and
network protocols for multimedia applications, a smoothing algorithm is specified and
designed to satisfy a given set of buffer and delay constraints. A large set of
experiments was conducted using statistics of MPEG video sequences to study the
performance of the algorithm. It has been shown that the algorithm is effective in
smoothing VBR video when its performance is compared with other techniques in the
literature. The design of the algorithm allows users to choose the optimal behavior for a
given performance measure, including the number of rate changes, the maximum rate and
the standard deviation of rate. This feature will be exploited in the next chapter when
renegotiating bandwidth with the network.

The proposed algorithm can be used as part of a video transport system where the
QoS of the underlying network services may change during the transmission. For networks
with no QoS guarantees, the algorithm can be extended to include varying network
conditions as well. Since the foundation of the model is based on the status of buffers
at the server and client, the algorithm can adapt to changing network conditions by
using the feedback from the client to recompute the bounds based on information about
the buffer occupancies at the client and server.
Chapter 3

Effects  of  Smoothing  on  End-to-end 
Deterministic  Guarantees  for  VBR 
Traffic 


3.1  Introduction 

Future packet-switching integrated services networks must provide support for
distributed multimedia applications with stringent delay and loss requirements in terms of
network performance. Of the many traffic classes in integrated services networks, delay
and loss sensitive VBR video traffic will be generated by most distributed multimedia
applications. For bursty VBR traffic, it is generally difficult to provide the QoS that
the network clients specify and to simultaneously achieve high network utilization.
There has been much research on the issue of providing services with different degrees of
QoS guarantees, including deterministic, statistical and best-effort services. Among
these, deterministic service guarantees that all packets of a connection will meet the
promised QoS, whereas with statistical service, only probabilistic performance bounds are
guaranteed. These two services can be viewed as a tradeoff between the QoS of the
connection and higher network utilization, since statistical multiplexing results in
overallocation of network resources. However, deterministic performance guarantees are
especially important for time-critical and loss-sensitive data given recent studies on
the effect of cell loss on the perceptual quality of video [68]. Contrary to conventional
wisdom, deterministic service does not require a peak-rate-allocation scheme, and
reasonable network utilization can be achieved even while providing worst-case guarantees
through better traffic models such as the Deterministic Bounding Interval Dependent
(D-BIND) traffic model [69], and with more accurate admission control schemes as in
[70-73].

An important issue in providing end-to-end performance guarantees is the effect of
smoothing bursty traffic on network utilization and QoS. In Chapter 3, the effect of
smoothing on the specification of UPC for ATM networks is investigated and it is found
that with smoothing of bursty traffic, the cost of transmission can be reduced by
allocating fewer network resources than for unsmoothed traffic. While reducing the
burstiness of VBR sources through traffic smoothing may help users reduce their cost and
at the same time increase network utilization, it also increases the end-to-end delay by
contributing extra delay at the network client buffer. In this thesis, the type of
smoothing in which bursts are spread over time by adding variable delay to packets is
considered, rather than reducing the source’s bandwidth or dropping packets during
bursts, both of which deteriorate the perceptual quality of the video [33, 68]. The
smoothing scheme proposed in Chapter 2 is used, which provides a bounded delay at the
buffer for each packet sent to the network.

The input traffic is characterized by the D-BIND traffic model using bounding rates
over multiple interval lengths, which allows for a higher network utilization by providing
a more accurate traffic specification [69]. With the D-BIND traffic model, it is possible
to define the relative burstiness of input traffic in a manner similar to [74]. When the
smoothed, less bursty traffic sources are multiplexed at queues inside the network, the
resulting bound on queuing delay is reduced. This allows for more admissible connections,
or better QoS support given the same number of connections. However, the extra
smoothing delay must be accounted for when considering the total end-to-end delay
bound. Thus, in this chapter, the effect of smoothing VBR video on the end-to-end delay
bound, which consists of smoothing delay at the client source buffer and queuing delay
inside the network, is investigated. Since smoothing decreases the bound on queuing
delay but increases the bound on smoothing delay, it can be considered as a tradeoff
between buffering at the source and buffering inside the network. For the case of ideal
smoothing where future traffic is known, it is shown both analytically and empirically
that the extra delay contributed by the smoothing of a source is equal to the gain in
queuing delay when multiplexing smoothed sources over a congested hop with homogeneous
sources, resulting in non-negative savings in the end-to-end delay bound. In a similar
work, smoothing over a single hop has been found ineffective due to the traffic shaping
implemented by a FIFO which services packets at a smoothing rate Rs, where Rs is
less than the unsmoothed source’s peak rate and greater than its long term average rate
[43]. In contrast, results presented in this thesis indicate that, with ideal smoothing,
it is possible to achieve higher network utilization without any degradation in the QoS
of the connection even for a single hop. For multiple congested nodes, smoothing
results in significant reductions in the end-to-end delay bound since the sum of the
savings in queuing delay at each congested hop is more than the extra smoothing delay
incurred at the source. Thus, a higher network utilization or lower end-to-end delay can
often be achieved by smoothing. This result also applies to real-time traffic, which
cannot be smoothed as efficiently as stored video. The experiments indicate that smoothing
of real-time traffic is beneficial only when there exists a minimum number of congested
hops in the path of the connection and when traffic is smoothed over at least four frame
periods, as smoothing has been shown to be most effective for delays over three frame
periods in Chapters 2 and 3.

In the recent literature, traffic shaping or smoothing has received much attention.
For example, in [43], the authors use techniques similar to those used here to determine
the conditions under which smoothing will result in a net reduction in the end-to-end
delay bound when traffic shaping is realized by a FIFO buffer with constant service rate.
In another similar work, the authors show that end-to-end delays with rate controlled
services are strictly less than with Rate Proportional Processor Sharing [45].

In a related work on deterministic guarantees, a traffic shaper which affects peak
rate or cell spacing has been considered, and for a single hop it was argued that
significant utilization gains are possible [46]. However, this gain is overstated since a
peak-rate-allocation is assumed, which was shown to be an unnecessary condition for
providing deterministic service [71, 73]. Several other works consider the case of traffic
shaping in terms of statistical performance guarantees, as in [47]. Another work considers
deterministic smoothing of MPEG traffic sources [75] and then provides statistical
guarantees using histogram techniques introduced in [76]. In this chapter, the
deterministic approach that does not allow the traffic shaper to discard packets is
considered.
This chapter is organized as follows. In Sections 3.2 and 3.3, the D-BIND traffic
model and its associated admission control test are reviewed for a First Come First Serve
(FCFS) scheduler with no-loss, no-delay-violation deterministic guarantees. In Section
3.4, it is proven analytically that for a single congested hop, when ideal smoothing is
applied to VBR sources, the reduction in queuing delay bound is equal to the additional
smoothing delay when all sources have the same smoothing delay bound. In Section 3.5,
the experimental results are presented for the traces of MPEG video to show the
effectiveness of smoothing and tradeoffs for both stored and real-time video. Finally,
Section 3.6 gives some concluding remarks.

3.2  The  Deterministic  D-BIND  Model 

Compared to statistical service, deterministic service provides better QoS in the
sense that it guarantees no loss and no delay violation. For the network to provide such
service, a deterministic upper bound is required on the traffic of every source receiving
this service. Such a bound also allows the source's traffic specification to be enforced
with schemes such as a leaky-bucket policer. Statistical models of the source, on the
other hand, are much more difficult to enforce.

The Deterministic Bounding Interval Dependent (D-BIND) model was introduced in
[69] to capture the property that sources exhibit burstiness over a wide variety of
interval lengths. The key properties of the D-BIND model are that it is bounding, as
required to provide deterministic QoS guarantees, and interval-dependent, as needed to
capture the burstiness properties of sources. Compared with the other models introduced
in [77], this more accurate traffic characterization allows for a higher network
utilization for a given delay bound.

Figure 3.1: D-BIND rate-interval pairs for a segment of Goldeneye Movie.

The D-BIND traffic model defines a traffic constraint function $b(t)$ which constrains
or bounds the source over every interval of length $t$. Denoting by $A[t_1, t_2]$ a
connection's arrivals in the interval $[t_1, t_2]$, $b(t)$ requires that
$A[s, s+t] \le b(t)$, $\forall s, t > 0$. The D-BIND model is defined via rate-interval
pairs $\{(R_k, I_k) \mid k = 1, 2, \ldots, P\}$. The constraint function is then defined
as a piece-wise linear function

\[
b(t) = \frac{R_k I_k - R_{k-1} I_{k-1}}{I_k - I_{k-1}} (t - I_k) + R_k I_k,
\qquad I_{k-1} \le t \le I_k, \tag{3.1}
\]

with $b(0) = 0$. Thus the rates $R_k$ are an upper bound on the rate over every interval
of length $I_k$ so that

\[
\frac{A[t, t + I_k]}{I_k} \le R_k \qquad \forall t > 0,\ k = 1, 2, \ldots, P. \tag{3.2}
\]
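
As an illustration of how rate-interval pairs and the constraint function of Equation (3.1) could be obtained from a frame-size trace, a minimal Python sketch is given here. It assumes that the interval lengths are whole multiples of the frame period and that each rate is the largest amount of traffic seen in any window of that length divided by the window length. The function names `dbind_pairs` and `constraint` and the toy trace are hypothetical and are not taken from the original work.

```python
from typing import List, Tuple


def dbind_pairs(frame_bits: List[int], frame_period: float,
                max_frames: int) -> List[Tuple[float, float]]:
    """Return [(R_k, I_k)] with I_k = k * frame_period, k = 1..max_frames."""
    prefix = [0]
    for bits in frame_bits:
        prefix.append(prefix[-1] + bits)
    pairs = []
    for k in range(1, min(max_frames, len(frame_bits)) + 1):
        # Largest number of bits observed in any window of k consecutive frames.
        worst = max(prefix[i + k] - prefix[i]
                    for i in range(len(frame_bits) - k + 1))
        interval = k * frame_period
        pairs.append((worst / interval, interval))
    return pairs


def constraint(pairs: List[Tuple[float, float]], t: float) -> float:
    """Piecewise-linear b(t) through (0, 0) and the points (I_k, R_k * I_k)."""
    if t < 0:
        raise ValueError("t must be non-negative")
    prev_interval, prev_burst = 0.0, 0.0
    for rate, interval in pairs:
        burst = rate * interval
        if t <= interval:
            slope = (burst - prev_burst) / (interval - prev_interval)
            return slope * (t - interval) + burst
        prev_interval, prev_burst = interval, burst
    # Assumption of this sketch: b(t) is not extrapolated beyond I_P.
    raise ValueError("t exceeds the largest interval length")


# Toy trace (hypothetical frame sizes, one time unit per frame).
trace = [100, 20, 20, 100, 20, 20]
pairs = dbind_pairs(trace, frame_period=1.0, max_frames=3)
print(pairs)
print(constraint(pairs, 1.5))
```

For a real trace the number of intervals would be chosen much larger, and the quadratic window scan could be replaced by a faster method. The sketch only aims to make the definition concrete.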

Figure 3.2: D-BIND constraint function for Goldeneye sequence.

In Figure 3.1, the D-BIND rate-interval pairs are plotted for a 40,000-frame trace of
the MPEG-compressed movie James Bond: Goldeneye. It can be observed that, for the
original traffic, $R_k$ approaches the source's peak rate for small interval lengths,
while for longer interval lengths it approaches the long-term average rate. The effect of
smoothing is a reduction in $R_k$ for small interval lengths, leading to an almost
constant value of $R_k$ for all interval lengths as the smoothing delay is increased.

Figure 3.2 shows the D-BIND constraint function $b(t)$ described by (3.1) for the
same movie. It can be seen that the D-BIND model captures the temporal properties of
the MPEG video. For example, the peak rate of the original unsmoothed traffic is caused
by the largest I-frame of the sequence, which gives the initial slope. Next, the slope
decreases with the transmission of a P-frame, which is usually smaller than an I-frame.
The effect of smoothing on the D-BIND constraint function is a smaller amount of traffic
within an interval, with the slope becoming constant as the smoothing delay increases.

Figure 3.3: Deterministic Admission Control.

3.3  Connection  Admission  Control 

Deterministic admission control relies on the delay analysis techniques of [71, 73],
which are illustrated in Figure 3.3. The upper curve represents the cumulative number of
bits that have arrived from all sources into the queue by time $t$, and the lower curve
represents the total number of bits transmitted by time $t$. The difference between the
two curves is the number of bits currently in the queue, or the backlog function. When
the backlog function is zero, there are no bits in the queue and thus a busy period has
ended. If the upper curve is a deterministic bounding curve, then the maximum delay can
be expressed as a function of the two curves. For example, the maximum backlog divided by
the link speed provides an upper bound on delay for a FCFS scheduler [72].

The constraint function provides the required bound on arrivals in any interval of
length $t$, so that with the aggregate of the individual sources' respective constraint
functions $b(t)$ forming the upper curve of Figure 3.3, admission control conditions for
deterministic delay and throughput may be derived. For example, for a FCFS scheduler with
$j = 1, 2, \ldots, n$ multiplexed connections constrained by their respective constraint
functions $b_j(t)$, a link speed $l$, and a maximum packet size $s$, a deterministic
upper bound on delay for all connections is given by

\[
d = \frac{1}{l} \max_{t \ge 0} \Big\{ \sum_{j=1}^{n} b_j(t) - l t + s \Big\}. \tag{3.3}
\]

The proof is given in Theorem 1 of [73]. In the homogeneous case, the maximum
number of connections that can be multiplexed for a delay bound $d$ is therefore given by

\[
N(d) = \max \Big\{ n \;\Big|\; \frac{1}{l} \max_{t \ge 0} \{ n b(t) - l t + s \} \le d \Big\}. \tag{3.4}
\]
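
A minimal sketch of this admission control test for homogeneous sources follows. It assumes that the constraint function is piecewise linear with breakpoints at the interval lengths, so that the maximum only needs to be checked at t = 0 and at each interval length, and that the maximum occurs no later than the largest interval. The function names and the numbers in the example are illustrative assumptions.

```python
from typing import Sequence


def fcfs_delay_bound(n: int, intervals: Sequence[float],
                     bursts: Sequence[float], link_rate: float,
                     max_packet: float) -> float:
    """Delay bound (1/l) * max_t { n*b(t) - l*t + s } checked at breakpoints.

    bursts[k] is b(I_k), the largest number of bits in an interval of
    length intervals[k], for one source.
    """
    worst = max_packet  # value at t = 0, where b(0) = 0
    for interval, burst in zip(intervals, bursts):
        worst = max(worst, n * burst - link_rate * interval + max_packet)
    return worst / link_rate


def max_connections(delay: float, intervals: Sequence[float],
                    bursts: Sequence[float], link_rate: float,
                    max_packet: float) -> int:
    """Largest n whose delay bound does not exceed the requested delay."""
    if not bursts or max(bursts) <= 0:
        raise ValueError("need at least one positive burst size")
    n = 0
    while fcfs_delay_bound(n + 1, intervals, bursts,
                           link_rate, max_packet) <= delay:
        n += 1
    return n


# Hypothetical source: b(1) = 100, b(2) = 120, b(3) = 140 (bits, time units).
intervals = [1.0, 2.0, 3.0]
bursts = [100.0, 120.0, 140.0]
print(fcfs_delay_bound(3, intervals, bursts, link_rate=200.0, max_packet=10.0))
print(max_connections(0.6, intervals, bursts, link_rate=200.0, max_packet=10.0))
```

The linear search over n is adequate for a sketch. Since the bound is non-decreasing in n, a binary search could be used for large link capacities.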


3.4  Effect  of  Smoothing  on  the  Deterministic 
Service 

From the D-BIND model point of view, smoothing of VBR traffic is the transformation
of a source with upper bounds $\{(R_k, I_k) \mid k = 1, 2, \ldots, P\}$ to
$\{(\hat{R}_k, \hat{I}_k) \mid k = 1, 2, \ldots, P\}$, with $\hat{R}_k \le R_k$ if
$\hat{I}_k = I_k$, as can be seen in Figure 3.1. A second view is the transformation of
the source's D-BIND constraint function $b(t)$ to a new constraint function $\hat{b}(t)$:

\[
\hat{b}(t) = \frac{\hat{R}_k \hat{I}_k - \hat{R}_{k-1} \hat{I}_{k-1}}{\hat{I}_k - \hat{I}_{k-1}} (t - \hat{I}_k) + \hat{R}_k \hat{I}_k,
\qquad \hat{I}_{k-1} \le t \le \hat{I}_k, \tag{3.5}
\]

with $\hat{b}(t) \le b(t)$ $\forall t$ from Equations (3.1) and (3.5) when
$\hat{R}_k \le R_k$. In this chapter, the formal definition of smoother traffic is used
in the following manner, similar to those in [43, 74]:

Definition 3.1 If $\lim_{t \to \infty} \hat{A}(t)/t = \lim_{t \to \infty} A(t)/t$, then
$\hat{A}(t)$ is considered smoother or less bursty than $A(t)$ if
$\hat{b}(t) \le b(t)$ $\forall t$.

An  example  of  the  transformation  of  smoothing  on  a  source’s  constraint  function 
b(t)  can  be  seen  in  Figure  3.2  for  two  smoothing  delay  values  of  1  and  2  frame  periods. 

From the network utilization point of view, less bursty sources lead to better network
utilization, as stated in the following Lemma [43]:

Lemma 3.1 If a source $j$ is smoothed so that the arrival process $\hat{A}(t)$ is less
bursty than $A(t)$, then the queuing delay bound for a FCFS scheduler is reduced.

The proof of Lemma 3.1 follows directly from Equations (3.1) and (3.3) and the fact that
$\hat{b}_j(t) \le b_j(t)$ $\forall t$. However, while smoothing reduces the queuing delay
bound, it also introduces an additional delay due to buffering at the client. The worst
case smoothing delay has been defined in [71] as the maximum horizontal time distance
between the two curves $b(t)$ and $\hat{b}(t)$. That is,

\[
\Delta_s = \max_{t_2 \ge t_1} \{\, t_2 - t_1 \mid \hat{b}(t_2) = b(t_1) \,\}. \tag{3.6}
\]
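
The maximum horizontal distance between two constraint curves can be computed as in the following sketch. It assumes that both curves are piecewise linear, start at the origin, and are strictly increasing, so that the horizontal gap is piecewise linear in the traffic level and its maximum is attained at a breakpoint level of one of the two curves. The curve representation as lists of (time, bits) points and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # breakpoints (t, bits), starting at (0, 0)


def inverse(curve: Curve, level: float) -> float:
    """Earliest time at which a piecewise-linear curve reaches `level`."""
    for (t0, y0), (t1, y1) in zip(curve, curve[1:]):
        if y0 <= level <= y1:
            if y1 == y0:  # flat segment: earliest time is its left end
                return t0
            return t0 + (level - y0) * (t1 - t0) / (y1 - y0)
    raise ValueError("level is outside the range of the curve")


def worst_case_smoothing_delay(original: Curve, smoothed: Curve) -> float:
    """Largest horizontal distance from the original to the smoothed curve."""
    top = min(original[-1][1], smoothed[-1][1])
    levels = {y for _, y in original + smoothed if y <= top}
    return max(inverse(smoothed, y) - inverse(original, y) for y in levels)


# Hypothetical curves: a bursty source and a constant-rate smoothed version.
original = [(0.0, 0.0), (1.0, 100.0), (2.0, 120.0), (3.0, 140.0)]
smoothed = [(0.0, 0.0), (3.0, 140.0)]
print(worst_case_smoothing_delay(original, smoothed))
```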

The worst case smoothing delay contributes to the total end-to-end delay perceived by the
source. A source should be smoothed if the additional delay bound is less than the
reduction in the queuing delay bound. Below, the conditions under which a source should
be smoothed are derived for the case in which the ideal smoothing function is used over a
single hop. First, the following notation is defined: $d$ represents the queuing delay
bound for an unsmoothed source, $\hat{d}$ represents the queuing delay bound for the
smoothed source, and $\Delta_s$ represents the worst case smoothing delay. The following
Lemma will be used in the proof of Theorem 3.1.

Lemma 3.2 If the ideal smoothing function with a maximum smoothing delay of $\Delta_s$
is applied to $A(t)$, then $A(t) \le \hat{A}(t + \Delta_s)$ $\forall t$, or equivalently
$b(t) \le \hat{b}(t + \Delta_s)$ $\forall t$.

Proof: When the ideal smoothing function is used with only a delay constraint at the
client buffer, the upper bound represents the original input traffic, whereas the lower
bound represents the input traffic with a time shift of $\Delta_s$ corresponding to the
worst case smoothing delay. Since the ideal path runs between the upper and lower bounds,
in the worst case $A(t) = \hat{A}(t + \Delta_s)$, corresponding to the case when the
shortest path passes through the lower bound. Then $A(t) \le \hat{A}(t + \Delta_s)$
$\forall t$ for the general case.

The ideal smoothing function uses the delay to its fullest possible extent, since it
searches for the shortest path through the bounds. This implies that when a large burst
arrives at time $t_k$, the smoothing function guarantees that
$A(t_k) = \hat{A}(t_k + \Delta_s)$, since the minimum slope of the shortest path is
obtained by passing through the lower bound rather than the upper bound, which would mean
a rapid increase in the slope.

3.4.1  The  Single  Hop  Case 

The following theorem shows that, for a single-hop network and homogeneous sources,
ideal smoothing results in no increase in the end-to-end delay bound.

Figure 3.4: Effect of smoothing on network queuing delay bound.

Theorem 3.1 In the single-hop case for homogeneous sources and deterministic delay
bounds, the reduction in the queuing delay bound, $d - \hat{d}$, introduced by the ideal
smoothing function is equal to the worst case smoothing delay $\Delta_s$ when
$d \ge \Delta_s$.

Proof: For $N$ homogeneous sources described by the D-BIND traffic model parameters
$\{(R_k, I_k)\}_{k=1}^{P}$ and a FCFS scheduler with link speed $l$, the queuing delay
bound for all unsmoothed sources can be obtained using Equation (3.3):

\[
d = \frac{1}{l} \max_{1 \le j \le P} \{ N b(I_j) - l I_j \}.
\]

If this delay bound occurs at the interval length $I_a$, then
$d = (N R_a I_a - l I_a)/l$ is obtained. Let the delay bound for smoothed sources be
$\hat{d} = (N \hat{R}_s I_s - l I_s)/l$, as shown in Figure 3.4. Using Lemma 3.2,
$b(I_a) \le \hat{b}(I_a + \Delta_s)$. Since all bits in the interval $I_a$ are sent in
the interval $I_a + \Delta_s$ for the worst case,
$b(I_a) = \hat{b}(I_a + \Delta_s) = \hat{b}(I_s)$ is obtained. This corresponds to
$b(I_a) = R_a I_a = \hat{b}(I_s) = \hat{R}_s I_s$. Substituting
$\hat{R}_s I_s = R_a I_a$ and $I_s = I_a + \Delta_s$ into the expression for $\hat{d}$
gives $\hat{d} = (N R_a I_a - l I_a - l \Delta_s)/l = d - \Delta_s$, which holds when
$d \ge \Delta_s$ since $\hat{d}$ cannot be negative. This completes the proof.
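
The substitution used in the proof can be checked numerically with a short sketch. The numbers are hypothetical and chosen only to illustrate the algebra: the same number of bits is sent in both cases, but the smoothed sources spread it over an interval that is longer by the smoothing delay. The sketch does not verify that the maximum for the smoothed sources actually occurs at that interval, which is what the theorem argues.

```python
def queuing_delay_bound(n_sources: int, burst_bits: float,
                        interval: float, link_rate: float) -> float:
    """(N * R * I - l * I) / l for the interval at which the maximum occurs."""
    return (n_sources * burst_bits - link_rate * interval) / link_rate


# Hypothetical values: 3 sources, 100 bits per source in the critical
# interval of length 1, link rate 200 bits per time unit.
n, link_rate = 3, 200.0
burst, critical_interval = 100.0, 1.0
smoothing_delay = 0.2

d = queuing_delay_bound(n, burst, critical_interval, link_rate)
# Smoothed case: same burst size, interval stretched by the smoothing delay.
d_hat = queuing_delay_bound(n, burst, critical_interval + smoothing_delay,
                            link_rate)

print(d, d_hat, d - d_hat)
```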

An  immediate  implication  of  Theorem  3.1  is  that  when  ideal  smoothing  function  is 
used,  higher  network  utilization  can  be  obtained  over  a  single  congested  hop  without  af¬ 
fecting  end-to-end  delay  bound  of  the  connection.  This  result  is  particularly  important 
since  it  shows  there  exists  a  smoothing  function  which  can  be  used  to  increase  network 
utilization  without  any  degradation  in  the  QoS  provided  to  the  smoothed  sources.  Using 
other  smoothing  schemes  has  been  found  to  be  ineffective  when  there  exists  only  a  single 
congested  hop  along  the  path  [43,  44]. 

3.4.2  The  Multi-Hop  Case 

Theorem 3.1 showed that, over a single hop, the ideal smoothing function is not an
effective means for achieving better QoS, since the net saving in the end-to-end delay
bound is zero. However, over multiple hops, queuing delays may be incurred at more than
one node, while the smoothing delay is incurred only once at the source's traffic shaper.
Thus, a smoother source can reduce its queuing delay at each congested hop, resulting in
a net saving in its end-to-end delay bound. The effectiveness of smoothing will depend on
the network load, the number of hops traversed, the burstiness of the stream, and the
desired delay bound. The following proposition, given in [43], provides a rule for
determining whether smoothing provides a net advantage in terms of end-to-end delay bound
for networks that use rate-controlled service disciplines such as Rate Controlled Static
Priority, Earliest Deadline First, or Hierarchical Round Robin [78]. Rate-controlled
service schemes reshape the traffic at each hop by using leaky buckets at each node rather
than only at the entrance of the network.

Proposition 3.1 If $\hat{d}_i$ is the queuing delay bound at hop $i$ for the smoothed
source and $d_i$ is the original queuing delay bound at hop $i$, a source will obtain a
net reduction in end-to-end delay bound due to smoothing if the following condition holds:

\[
\Delta_s < \sum_{i=1}^{H} (d_i - \hat{d}_i), \tag{3.7}
\]

where $H$ is the number of hops between the source and destination for networks using a
rate-controlled service discipline.
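
The rule of Proposition 3.1 amounts to comparing the smoothing delay with the sum of the per-hop reductions in queuing delay bound. A minimal sketch, with hypothetical function names and per-hop delay values in milliseconds, could look as follows.

```python
from typing import Sequence


def net_saving(original_delays: Sequence[float],
               smoothed_delays: Sequence[float],
               smoothing_delay: float) -> float:
    """Sum of per-hop reductions in delay bound minus the smoothing delay."""
    if len(original_delays) != len(smoothed_delays):
        raise ValueError("one delay bound per hop is required in both lists")
    reduction = sum(d - d_hat
                    for d, d_hat in zip(original_delays, smoothed_delays))
    return reduction - smoothing_delay


def smoothing_pays_off(original_delays: Sequence[float],
                       smoothed_delays: Sequence[float],
                       smoothing_delay: float) -> bool:
    """True when smoothing gives a strict net reduction in end-to-end bound."""
    return net_saving(original_delays, smoothed_delays, smoothing_delay) > 0


# Hypothetical per-hop bounds in milliseconds over three hops.
print(net_saving([300, 300, 300], [100, 100, 100], 200))
print(smoothing_pays_off([300], [100], 200))
```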

The  proof  of  Proposition  3.1  is  given  in  [43].  Theorem  3.2  provides  the  minimum 
number  of  congested  hops  in  order  to  obtain  a  net  reduction  in  end-to-end  delay  bound 
when  ideal  smoothing  function  is  used. 

Theorem  3.2  In  the  multiple-hop  network  with  homogeneous  sources  and  determinis¬ 
tic  delay  bounds,  a  source  will  obtain  a  net  reduction  in  end-to-end  delay  bound  due  to 
ideal  smoothing  if  the  number  of  congested  hops  in  the  path  is  at  least  two. 

Figure 3.5: Network topology used in the experiments.

Proof: The net reduction in end-to-end delay bound is zero for the single-hop case, so
$\Delta_s = d_1 - \hat{d}_1$. If there is one more congested node in the path, then
$\Delta_s < \sum_{i=1}^{2} (d_i - \hat{d}_i)$ since $d_2 - \hat{d}_2 > 0$. This completes
the proof.


3.5  Experimental  Results 

A segment of the MPEG-compressed video Formula 1 Race is used to verify the results of
Theorems 3.1 and 3.2. The trace includes 10,000 frames, corresponding to a total running
time of about 7 minutes at a display rate of 24 frames/sec. Only network services with
deterministic QoS guarantees are considered, where the network provides no-loss,
no-delay-violation guarantees. The experiments consider both the single-hop and the
multiple-hop cases with the network topology shown in Figure 3.5. $N$ connections are
smoothed and then multiplexed at the network nodes. The packets of a connection traverse
$H$ hops until they reach their destination.

The experiments are conducted as follows. First, the MPEG trace is used to calculate
the source's D-BIND parameters under various smoothing delays, each of which is expressed
by the parameter $S$, the number of frames smoothed over. Then, the admission control test
of Equation (3.3) is applied to determine the maximum number of homogeneous admissible
connections on the 155 Mbps link for a given queuing delay bound. The following
performance measures are used to evaluate the effectiveness of the smoothing function:
the average utilization of the network, which is calculated as
$N \cdot M / (155 \cdot 10^6)$, where $M$ is the long-term average bit rate of the source
in bits/sec; the net end-to-end savings in delay bound from smoothing connection $j$,
$D_j - \hat{D}_j = \sum_{i=1}^{H} (d_{ij} - \hat{d}_{ij}) - \Delta_{sj}$, where $D_j$ and
$\hat{D}_j$ correspond to the total end-to-end delay bound for the original and smoothed
source $j$; and the total end-to-end delay bound
$\hat{D}_j = \sum_{i=1}^{H} \hat{d}_{ij} + \Delta_{sj}$.

Figure 3.6: Average utilization of the multiplexer for various smoothing delays.
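
The utilization measure and the conversion of a smoothing delay from frame periods to seconds can be written down directly. In the sketch below the 155 Mbps link rate and the 24 frames/sec display rate come from the experimental setup, while the function names, the treatment of S frame periods as a delay of S/24 seconds, and the example source rate are assumptions made for illustration.

```python
LINK_RATE_BPS = 155e6   # link speed used in the experiments
FRAME_RATE = 24.0       # display rate of the trace in frames/sec


def average_utilization(n_connections: int, mean_rate_bps: float,
                        link_rate_bps: float = LINK_RATE_BPS) -> float:
    """N * M / l: fraction of the link used by the long-term average rates."""
    return n_connections * mean_rate_bps / link_rate_bps


def smoothing_delay_seconds(frames: int,
                            frame_rate: float = FRAME_RATE) -> float:
    """Duration of a smoothing delay expressed in frame periods."""
    return frames / frame_rate


# Hypothetical example: 100 connections with a 500 kbit/s average rate each.
print(average_utilization(100, 500_000.0))
print(smoothing_delay_seconds(7))
```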

3.5.1  Ideal  Smoothing  (Stored  Video) 

In the first set of experiments, the average utilization of the multiplexer is used as
the performance index. Figure 3.6 shows the effect of smoothing on the queuing delay
bound, that is, the delay experienced at a network node. The general shape of the curves
indicates that as the queuing delay bound increases, more connections are admissible, so
that a higher utilization is possible. As stated in Lemmas 3.1 and 3.2, when the smoothing
delay increases (from 0 to 4 in this case), the traffic transmitted to the network becomes
smoother, so that the queuing delay bound is reduced or, equivalently, a higher
utilization is achievable in the network for a given queuing delay bound.

Figure 3.7: Average utilization as a function of total end-to-end delay bound
(a) for the single hop and (b) for three hops.

When the total end-to-end delay bound is considered, including both smoothing and
queuing delays, the average utilization is the same for all smoothing delays in the
single-hop case, as shown in Figure 3.7(a). As stated in Theorem 3.1, there is no increase
in the total end-to-end delay bound for the single-hop case when the queuing delay is
larger than the smoothing delay of the source. Figure 3.7(b) shows the same experiment
except that the sources traverse three hops rather than one. In this case, smoothing
results in a substantial reduction in a source's end-to-end delay bound, as explained by
Theorem 3.2: if there is more than one congested node in the path, the end-to-end delay
bound is reduced. Equivalently, for a given end-to-end delay bound, more connections can
be admitted to the network.

Figure 3.8: Effect of the number of hops on the savings in total delay bound.

Figure 3.8 shows the effect of the number of hops on the savings in end-to-end delay
bound for a fixed number of connections, so that the average network utilization is
32.23% in all cases. In the homogeneous case, $D - \hat{D}$ reduces to
$H(d_j - \hat{d}_j) - \Delta_s$, so the savings in delay bound increase linearly with the
number of hops. As predicted by Theorem 3.2, the lines start at zero for a single hop and
become positive at two hops. With larger smoothing delays, more gain is obtained, as
expected.
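
Because the saving grows linearly with the number of hops in the homogeneous case, the smallest number of congested hops that yields a positive saving has a closed form. The sketch below assumes identical per-hop reductions at every hop; the function names and the millisecond values are hypothetical.

```python
import math
from typing import Optional


def homogeneous_saving(hops: int, per_hop_reduction: float,
                       smoothing_delay: float) -> float:
    """H * (d - d_hat) - smoothing delay for identical congested hops."""
    return hops * per_hop_reduction - smoothing_delay


def min_congested_hops(per_hop_reduction: float,
                       smoothing_delay: float) -> Optional[int]:
    """Smallest H with a strictly positive saving, or None if there is none."""
    if per_hop_reduction <= 0:
        return None  # smoothing does not reduce the per-hop bound at all
    return math.floor(smoothing_delay / per_hop_reduction) + 1


# Ideal smoothing on a loaded hop: the per-hop reduction equals the
# smoothing delay, so two hops are needed (values in milliseconds).
print(min_congested_hops(200, 200))
# A weaker smoother that recovers only a quarter of its delay per hop.
print(min_congested_hops(50, 200))
```

With a per-hop reduction equal to the smoothing delay, the saving is zero for one hop and positive from two hops on, which agrees with Theorem 3.2.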

Figure 3.9: Effect of network load.


Figure 3.9 shows the savings in end-to-end delay bound as a function of network
utilization. The smoothing delay is fixed at $S = 2$. As can be observed, when the network
is lightly loaded, smoothing results in a negative delay saving for both the one-hop and
three-hop cases. From Theorem 3.1, it is known that there is a non-negative saving in
total delay only if the smoothing delay does not exceed the queuing delay bound for
unsmoothed sources at the given network utilization, namely $d \ge \Delta_s$. As long as
the network load is such that $d$ is less than $\Delta_s$, the net saving is negative for
the one-hop case. When $d$ exceeds $\Delta_s$, savings are non-negative for both the
one-hop and three-hop cases. Notice that the three-hop case starts obtaining non-negative
savings at a smaller network utilization than the one-hop case due to the extra savings at
the two additional congested hops.

Figure 3.10 shows the effect of the smoothing interval (number of frames smoothed
over) on the savings in end-to-end delay, $D - \hat{D}$, for the one-hop case at 18.8% and
33.2% network utilization levels. At the 18.8% utilization level, smoothing is not
effective at all, whereas at the 33.2% utilization level, smoothing delays of up to 7
frame periods give zero net gain, while for longer smoothing delays the queuing delay
bound of 291.67 msec is exceeded. The results of the same experiments conducted for the
three-hop case are shown in Figure 3.11. As in the one-hop case, there is no positive
saving at the 18.8% utilization level. At the 33.2% utilization level, positive savings
are achieved, and a smoothing delay of 7 frame periods provides the maximum saving. As
observed in Figure 3.8, savings increase with larger smoothing delays, which explains the
positive slope before the peak occurs. However, the saving starts to decrease for
smoothing delays larger than 7 frame periods, since the queuing delay is then smaller than
the smoothing delay. This indicates the importance of choosing the correct smoothing
interval for a given network load according to the rules of Theorems 3.1 and 3.2 and
Proposition 3.1, since improperly chosen smoothing policies can be far worse than doing no
smoothing at all.

Figure 3.10: Effect of smoothing delay for one-hop network.

Finally, Figure 3.12 depicts the effect of smoothing delay on the total end-to-end
delay bound. Network clients are interested in the achievable end-to-end delay bound, and
it can be observed that choosing an improper smoothing interval can increase the
end-to-end delay. For example, if a 300 msec end-to-end delay is required, the only
admissible smoothing delays are 6, 7 and 8 frame periods at 33.2% network load.

Figure 3.11: Effect of smoothing delay for three-hops network.


3.5.2  Real-time  Traffic 


In order to evaluate the effect of smoothing on the network utilization for real-time
traffic, the same set of experiments was conducted using the real-time smoothing
algorithms presented in Chapter 2. Algorithms 6 and 7 correspond to the original and
improved smoothing algorithms proposed for real-time traffic. Algorithm 7 is improved in
terms of the number of rate changes, at the expense of a higher peak rate and a larger
variation in the rate.

Figure 3.12: Effect of smoothing delay on total end-to-end delay bound.

Figure 3.13: Utilization vs queuing delay for real-time smoothing.

Figure 3.13 shows the network utilization as a function of queuing delay bound
using Algorithm 6 for various smoothing delays. For smoothing delays of 2 and 3 frame
periods, real-time smoothing does not decrease the queuing delay bound. This indicates
that although real-time smoothing generates a smoother traffic profile in general, there
could be more arrivals within an interval due to the prediction error for future picture
sizes, or equivalently $\hat{b}(t) > b(t)$ for some $t$. In Chapter 3, it has been shown
that the real-time smoothing algorithm performs the worst from the MBS point of view when
the smoothing delay is between 2 and 4 periods. However, for smoothing delays larger than
3 periods, network utilization is increased, indicating that the real-time smoothing
algorithm generates smoother traffic according to Definition 3.1.

Figure 3.14: Smoothing delay vs. savings in the end-to-end delay bound for
real-time and ideal smoothing algorithms.


In Figure 3.14, the effect of smoothing delay on the savings in end-to-end delay
bound is depicted for real-time smoothing Algorithms 6 and 7 in a network with four
congested hops. The results indicate that, from the network utilization point of view,
Algorithm 6 performs better than Algorithm 7 but can never match the performance of ideal
smoothing. Even for large smoothing delays, Algorithm 7 cannot increase network
utilization, although it requires a less demanding UPC than Algorithm 6. These results
are in line with the observation made in Chapter 3 regarding the insufficiency of UPC
parameters to express the network resource requirements of the source. Compared with
Algorithm 6, Algorithm 7 costs the network more in terms of allocated resources but costs
the client less in terms of the specified UPC. Figure 3.15 shows the effect of smoothing
delay on the total end-to-end delay bound. It is clear that Algorithm 7 should not be used
from the network utilization point of view.

Figure 3.15: Effect of smoothing delay on the end-to-end delay.

Figure 3.16: Minimum number of congested hops to obtain positive savings.

In Figure 3.16, the minimum number of congested hops needed to obtain positive savings
in the end-to-end delay bound is presented as a function of smoothing delay. At negative
values, network utilization is lower than in the case when unsmoothed sources are used.
With the exception of a smoothing delay of 3 periods, Algorithm 6 can always be used to
decrease the end-to-end delay bound if there is a sufficient number of congested hops in
the path. For example, if the number of congested hops is at least 5, then Algorithm 6 can
be used with smoothing delays larger than 4 frame periods. However, with Algorithm 7, it
is almost impossible to provide such a general rule, since the burstiness of the smoothed
source is not bounded by the smoothing delay.

The results obtained in this chapter are valid for networks with the FCFS scheduling
discipline, where packets are served according to the first-come first-served policy.
Although other scheduling disciplines, such as Round Robin (RR) or Weighted Round Robin
(WRR), could also be considered, it was found in [79] that the RR and WRR disciplines are
ill suited for CBR traffic, both in terms of performance and implementation complexity.
Therefore, the FCFS scheduling discipline is believed to be a reasonable choice in the
context of the research pursued in this chapter, since piecewise CBR traffic is generated
by the smoothing function.

Finally,  it  should  be  noted  that  the  reported  results  include  the  worst  case  measure¬ 
ments  of  queuing  delay  so  network  utilizations  presented  here  are  for  deterministic  serv¬ 
ice  only.  However,  this  does  not  prevent  the  utilization  of  unused  network  resources  for 
services  providing  statistical  or  best-effort  guarantees. 

3.6  Conclusion 

In this chapter, the effect of smoothing on the performance of networks with
deterministic guarantees was investigated. It was shown both analytically and empirically
that, when ideal smoothing is used, positive savings in the end-to-end delay bound are
obtained when there exist at least two congested hops in the path. Equivalently, the extra
smoothing delay is equal to the gain in queuing delay when multiplexing homogeneous
sources at a congested hop. This is in contrast to the result found in a previous study
where smoothing results in no benefit to the network client for a single-hop case when
traffic shaping is implemented by a FIFO with a constant service rate. However, when
real-time traffic is submitted to the network, more congested hops are required in order
to justify the benefit of smoothing. It was also found that the current ATM UPC parameter
set is not sufficient to express the actual requirements of the network clients from the
network cost point of view. It was illustrated that sources with a less demanding UPC can
actually cost the network more in terms of allocated resources than less bursty sources.

From  the  network  point  of  view,  the  benefit  obtained  by  smoothing  scheme  corre¬ 
sponds  to  higher  utilization  of  network  capacity  since  more  connections  are  admissible  at 
a  given  QoS.  This  benefit  may  be  realized  with  a  better  QoS  (via  a  reduced  delay  bound) 
or  a  better  price  of  service  (via  increasing  the  network  utilization).  The  pricing  scheme 
must  encourage  the  clients  to  smooth  their  traffic  for  maximum  benefit.  This  can  be 
provided  by  incorporating  price  into  the  smoothing  decision  when  a  QoS  is  being  re¬ 
quested  by  the  client,  which  would  presumably  encourage  some  intended  type  of  behav¬ 
ior. 

Chapter 4

Aggregate  Smoothing:  Integration 
of  Traffic  Shaping  and  Multiplexing 


4.1  Introduction 

With the deployment of broadband integrated services based on the Asynchronous
Transfer Mode (ATM) technology, it will be possible to provide a large variety of services
and distributed applications. It is likely that video traffic will dominate due to the
high bandwidth requirement of full-motion HDTV-quality video transmission. To reduce the
bandwidth needed, video is generally compressed with a video compression standard such as
MPEG-1 or MPEG-2 [6, 8] before transmission. The compression method of a video stream can
be either CBR compression, where the output bit rate of the encoder is forced to be
constant, resulting in variable image quality, or VBR compression, where the output bit
rate varies according to the requirements of the underlying video sequence, guaranteeing
constant image quality. As described in Chapter 3, CBR transport of video is easier to
manage from the network point of view since bandwidth allocation and tariffs for network
usage are simple. It is also straightforward for the network to multiplex several CBR
channels onto a communication channel since the cells arrive at a constant rate. However,
it is difficult to statistically multiplex VBR video streams and guarantee lossless
delivery of cells, since the bit rates of the multiplexed streams may peak together.
Lossless cell delivery is not possible unless the peak bandwidth is allocated, in which
case the delivery of the VBR stream will be expensive. When several video streams are to
be transported as a bundle over a channel, it is possible to shape or smooth each VBR
stream such that the aggregate bandwidth of the channel is reduced. To distinguish this
from individual smoothing, the term aggregate smoothing will be used. This chapter focuses
on aggregate smoothing of a group of video streams to be delivered as a bundle, given the
individual constraints of each stream in terms of delay and buffer bounds.

Application areas of aggregate smoothing include video broadcasting, video-on-demand
(VoD) and long-distance video-telephony service [11, 84]. In the case of video
broadcasting, all the video streams originating from a broadcasting center are to be
delivered to all receivers. As illustrated in Figure 4.1(a), a single channel from the
broadcasting center to the local fiber node can be used to deliver all the video streams,
which are then distributed to the households in the neighborhood. Figure 4.1(b) depicts
the VoD scenario in which the video streams are not all destined to a common destination,
but rather one video session is delivered to each household. This, however, does not
preclude the use of video aggregation. In a public network, there is typically a
distribution center to which many subscribers in a neighborhood are connected. The VoD
server may be located in a central office and serve an area covered by several distribution
centers. Video streams targeted to the same distribution node may be aggregated; at the
distribution node, they are separated and forwarded to their respective destinations.
Figure 4.1(c) shows the long-distance video-telephony service scenario. This service
provides point-to-point communications where sources and receivers are geographically
separated. The video streams destined for a common remote area can be aggregated at the
local network headend to which subscribers of the local region are connected. In this
scenario, the goal of aggregation is to save expensive long-distance bandwidth.

Figure 4.1: Applications of aggregate smoothing: (a) video broadcasting, (b) video-on-demand,
(c) video-telephony service.

In Chapter 2, smoothing algorithms that minimize the number of rate changes were
developed so that the renegotiation cost is reduced when CBR transport is used for
VBR-compressed video. However, for a single video session with real-time traffic, the
renegotiation cost was also found to be high given the current processing capability of a
typical ATM access switch. When several video streams are to be transported as a bundle,
the number of rate changes for the common channel can be further reduced. Instead of
smoothing each video session individually, the aggregate rate is smoothed. The resulting
traffic profile of each video session is burstier than when individual smoothing is
applied. However, when the smoothed bit streams are multiplexed onto a common link,
aggregate smoothing results in fewer rate changes and smoother traffic. Therefore, the
bandwidth requirements and the number of rate changes are reduced for the combined
traffic when aggregate smoothing is used. This scheme can be used by any video transport
system with multiple video sources that can be multiplexed onto a common link. One
particular feature of aggregate smoothing is that individual buffer and delay constraints
are also guaranteed. To the author's knowledge, this is the first scheme that smooths the
aggregate rate and also satisfies the buffer and delay constraints of the multiplexed
video streams.

From  the  network  utilization  viewpoint,  the  benefit  of  aggregate  smoothing  is  obvi¬ 
ous  since  the  traffic  profile  corresponding  to  aggregate  rate  is  smoother.  However,  from 
the  individual  stream  viewpoint,  the  resulting  traffic  profile  will  be,  in  general,  burstier 
compared  to  the  case  when  individual  smoothing  is  applied  to  each  video  stream  as  an 
independent  process.  This  may  affect  the  network  QoS  provided  for  that  particular 
stream. Therefore, the effect of aggregate smoothing on the network delay is investigated
and  it  is  found  that  significant  savings  in  end-to-end  delay  bound  can  be  achieved  when 
network  load  is  high. 

Related work includes studies that perform statistical multiplexing and compression of
several video streams jointly. In [80-82], a smoothing buffer is used to collect outputs
from the video streams. The encoding parameters of the video streams (e.g., the
quantization scale) are directly modified based on the CBR bandwidth constraint. The
occupancy level of the buffer is used as the feedback to determine the amount of data the
encoders may output in the future. The key idea is that when the buffer level is high, the
encoders must encode at a lower image quality to prevent buffer overflow, and when the
buffer level is low, the encoders can encode at a higher image quality. In contrast,
constant-quality encoding (e.g., open-loop with no feedback from the network) is assumed
here, which allows for HDTV-quality video transport without any loss. Also, one does not
need to deal with the intricate problem of setting appropriate feedback parameters during
system design in order to get reasonable image quality. References [83] and [84] do not
use the smoothing-buffer feedback mechanism. In [83], the encoding parameters of the
video streams are directly modified to reduce the bit rate, whereas in [84], data in each
video stream is discarded according to some signal-to-noise or distortion metric in order
to reduce the aggregate bit rate below the reserved bit rate of the CBR channel. The
scheme in [84] has the following disadvantages with respect to aggregate smoothing.
First, it is not suitable for no-loss, constant-quality video communications. Second, it
provides a mechanism only for MPEG-compressed video streams. Third, implementation
complexity can be high for real-time processing since the unit of video aggregation is a
slice period, which is much smaller than a frame period. Finally, it is not clear how to
determine the CBR bandwidth required by a group of aggregated video streams such that the
cell loss rate can be bounded. Another work solves the minimum reservation rate problem
for multiple pre-encoded MPEG video streams over a CBR channel [85]. However, that
solution applies to VoD-type applications that simply play back stored MPEG video, and it
requires very large buffers at the receiver for a two-hour movie. In both [84] and [85], a
CBR channel is used for the whole session. The RCBR scheme, on the other hand, allows for
better resource utilization and provides the true bandwidth as needed. With aggregate
smoothing, the number of rate changes is significantly reduced such that RCBR transport
can be used with little rate renegotiation cost.

The  rest  of  the  chapter  is  organized  as  follows.  In  Section  4.2,  the  concept  and  de¬ 
scription  of  aggregate  smoothing  is  presented.  In  Section  4.3,  the  performance  of  aggre¬ 
gate  smoothing  is  evaluated  by  using  MPEG  traces  of  stored  and  real-time  video.  In 
Section  4.4,  the  effect  of  aggregate  smoothing  on  the  network  utilization  is  investigated 
and  it  is  shown  that  aggregate  smoothing  allows  the  network  to  provide  better  QoS  com¬ 
pared  to  the  case  when  smoothing  is  applied  to  each  video  stream  as  an  independent 
process.  Finally,  Section  4.5  concludes  this  chapter. 

4.2  Specification  of  Aggregate  Smoothing 

In Figure 4.2, the system model for aggregate smoothing is shown. $N$ video streams
with constraints $(D_i, B_i)$ are multiplexed onto a link, where $D_i$ denotes the extra
smoothing delay bound in terms of frame periods and $B_i$ denotes the maximum buffer size
in bits for stream $i$. Each stream has its own smoothing buffer. The size of the frame of
stream $i$ that arrives at the buffer at the beginning of period $k$ is denoted by $E_k^i$,
and its smoothed rate at the buffer output is denoted by $R_k^i$. A simple round-robin
scheme is used at the multiplexer. The objective of aggregate smoothing is to smooth the
multiplexer output given the set of constraints $(D_i, B_i)$ for $i = 1, \ldots, N$. The
aggregate rate function at period $k$ is defined as $R_k^a = \sum_{i=1}^{N} R_k^i$ and the
rate vector as $\mathbf{R}_k = [R_k^1, R_k^2, \ldots, R_k^N]_{N \times 1}$. In the
following, the aggregate smoothing algorithm is described.

Step 1:

Derive the upper and lower bounds for the cumulative aggregate rate function. Assume
that the minimum buffer size is specified as zero ($B_i^{\min} = 0$) and $B_i^{\max} = B_i$.
The number of frames at the buffer at time $t = 0$ is one. From Equation (2.7), the upper
and lower bounds for bit stream $i$, $i = 1, \ldots, N$, with constraints $(D_i, B_i)$ can
be written as:

$$L_j^i = \max\left( \sum_{k=1}^{j-D_i+1} E_k^i,\; \sum_{k=1}^{j} E_k^i - B_i \right) \text{ for } j \ge D_i \quad \text{and} \quad U_j^i = \sum_{k=1}^{j} E_k^i \text{ for } j \ge 1.$$

The cumulative aggregate rate function can then be bounded by

$$\sum_{i=1}^{N} L_j^i \;\le\; \sum_{k=1}^{j} R_k^a \;\le\; \sum_{i=1}^{N} U_j^i \qquad (4.1)$$
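As an illustration of Step 1, the following Python sketch computes cumulative bounds for
each stream and sums them into aggregate bounds in the spirit of Equation (4.1). It is a
minimal sketch under stated assumptions, not the author's implementation: the bound form
(upper bound equal to the cumulative arrivals, lower bound equal to the larger of the
delay-limited and buffer-limited amounts) is assumed, and the function names
`stream_bounds` and `aggregate_bounds` are hypothetical.

```python
from itertools import accumulate


def stream_bounds(frames, delay, buf=float("inf")):
    """Cumulative (lower, upper) bounds for one stream.

    frames: frame sizes in bits, one per frame period.
    delay:  smoothing delay bound D_i in frame periods (assumed >= 1).
    buf:    maximum buffer size B_i in bits (inf means no buffer constraint).
    Assumed form: U_j = sum_{k<=j} E_k and
                  L_j = max(sum_{k<=j-D+1} E_k, U_j - B).
    """
    upper = list(accumulate(frames))
    shift = delay - 1
    lower = []
    for j, total in enumerate(upper):
        # Data that arrived D-1 periods earlier must have left by period j.
        due = upper[j - shift] if j >= shift else 0
        # The backlog may never exceed the buffer size.
        lower.append(max(due, total - buf))
    return lower, upper


def aggregate_bounds(streams):
    """Sum the per-stream bounds; streams is a list of (frames, delay, buf)."""
    per_stream = [stream_bounds(f, d, b) for f, d, b in streams]
    agg_lower = [sum(v) for v in zip(*(lo for lo, _ in per_stream))]
    agg_upper = [sum(v) for v in zip(*(up for _, up in per_stream))]
    return agg_lower, agg_upper
```

For instance, `stream_bounds([4, 2, 6], delay=2, buf=5)` tightens the last lower bound
because the buffer constraint becomes binding there. The aggregate bounds are what the
shortest-path search of Step 2 would operate on.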


Step 2:

Find the shortest path through the upper bounds given by $\sum_{i=1}^{N} U_j^i$ and the
lower bounds given by $\sum_{i=1}^{N} L_j^i$ using the algorithm specified in Figure 2.4 to
obtain the aggregate rate $R_k^a$ at the multiplexer output.

Step 3:

For $i = 1, \ldots, N$, find the shortest path through $L_j^i$ and $U_j^i$ using the
algorithm specified in Figure 2.4 to obtain the optimal smoothing rate $\hat{R}_k^i$ for
stream $i$. Let the optimal rate vector be defined as
$\hat{\mathbf{R}}_k = [\hat{R}_k^1, \hat{R}_k^2, \ldots, \hat{R}_k^N]_{N \times 1}$.

Step 4:

At the final step, derive the rate vector $\mathbf{R}_k$ from the optimal rate vector
$\hat{\mathbf{R}}_k$ given the constraint that $\sum_{i=1}^{N} R_k^i = R_k^a$, where
$R_k^a$ is the aggregate rate. The closest vector to the surface defined by
$\sum_{i=1}^{N} R_k^i = R_k^a$ gives the best possible suboptimal rate vector in the
Euclidean space. Then, $R_k^i$ is given by $r^i$, where $r^i$ satisfies
$\min \sum_{i=1}^{N} (r^i - \hat{R}_k^i)^2$. To solve for $r^i$, $r^N$ is substituted with
$R_k^a - \sum_{j=1}^{N-1} r^j$, and the minimum of $\sum_{i=1}^{N} (r^i - \hat{R}_k^i)^2$ is
found by taking its derivative with respect to $r^j$ and setting it to zero. Then, for
$i = 1, \ldots, N-1$, $R_k^i$ is given by

[Equation (4.2)]

where $A_j$ is computed recursively as $A_j = 2 \cdot A_{j-1} - (j - 1)$ for $j \ge 2$ with
the initial conditions $A_0 = 1$ and $A_1 = 2$. $R_k^N$ is given by

$$R_k^N = R_k^a - \sum_{j=1}^{N-1} R_k^j \qquad (4.3)$$
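For reference, the constrained minimization of Step 4 also admits a compact closed form.
The derivation below uses a Lagrange multiplier instead of the substitution described
above, so it is an equivalent route to the minimizer and not a reproduction of Equation
(4.2). The symbols $\hat{R}_k^i$ (rate of stream $i$ from Step 3) and $R_k^a$ (aggregate
rate from Step 2) are notation assumed for this sketch.

```latex
\begin{align*}
\mathcal{L}(r, \lambda) &= \sum_{i=1}^{N} \bigl(r^i - \hat{R}_k^i\bigr)^2
    - \lambda \Bigl( \sum_{i=1}^{N} r^i - R_k^a \Bigr) \\
\frac{\partial \mathcal{L}}{\partial r^i} = 0
    \;&\Rightarrow\; r^i = \hat{R}_k^i + \frac{\lambda}{2} \\
\sum_{i=1}^{N} r^i = R_k^a
    \;&\Rightarrow\; \frac{\lambda}{2} = \frac{1}{N}\Bigl( R_k^a - \sum_{j=1}^{N} \hat{R}_k^j \Bigr) \\
r^i &= \hat{R}_k^i + \frac{1}{N}\Bigl( R_k^a - \sum_{j=1}^{N} \hat{R}_k^j \Bigr)
\end{align*}
```

In this form every stream receives the same correction, namely the gap between the
aggregate rate and the sum of the individually optimal rates divided by $N$. Note also that
the recursion $A_j = 2 \cdot A_{j-1} - (j - 1)$ with $A_0 = 1$ and $A_1 = 2$ evaluates to
$A_j = j + 1$.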

However, $R_k^i$ as derived in Equations (4.2) and (4.3) may be out of bounds for
some $i$, that is, $R_k^i > U_k^i$ or $R_k^i < L_k^i$ for some $i$. In Figure 4.3, two cases
are illustrated. The first case is when the estimated aggregate rate $R_k^a$ is larger than
the sum of the upper bounds of $R_k^i$ or smaller than the sum of the lower bounds of
$R_k^i$, as shown in Figure 4.3(a). This case occurs when the prediction error for the
future traffic is large, resulting in overestimation or underestimation of the aggregate
bit rate. Therefore, the following conditions are checked before the closest Euclidean
vector is derived in Step 4:

if $R_k^a < \sum_{i=1}^{N} L_k^i$, then choose $R_k^a = \sum_{i=1}^{N} L_k^i$ and $R_k^i = L_k^i$
if $R_k^a > \sum_{i=1}^{N} U_k^i$, then choose $R_k^a = \sum_{i=1}^{N} U_k^i$ and $R_k^i = U_k^i$

If either of these conditions is true, Step 4 is skipped; otherwise, $R_k^i$ is derived as
in Step 4.

Figure  4.3(b)  illustrates  the  second  case  where  the  rate  vector  found  in  Equations 
(4.2)  and  (4.3)  is  out  of  bounds  for  some  i.  In  this  case,  the  rate  vector  that  satisfies  the 
minimum  boundary  condition  is  found  by  traversing  along  the  surface  until  the  closer  up¬ 
per  or  lower  boundary  point  is  reached.  This  is  done  by  calculating  the  net  increase  or 
decrease  in  the  aggregate  rate  after  out-of-bound  rates  are  adjusted.  In  order  for  the  ag¬ 
gregate  rate  not  to  change,  other  in-bound  rates  are  also  adjusted  considering  their  re¬ 
spective  upper  and  lower  bounds. 
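The two out-of-bounds cases can be sketched in Python as follows. This is one possible
reading of the procedure, not the author's implementation: the function name
`split_aggregate` is hypothetical, the equal shift is the closest point on the plane where
the rates sum to the aggregate rate, and the clamp-and-redistribute loop is an assumed way
of adjusting the in-bound rates so that the aggregate rate does not change.

```python
def split_aggregate(target, optimal, lower, upper):
    """Split the aggregate rate `target` among N streams.

    optimal: individually optimal rates from Step 3.
    lower, upper: per-stream rate bounds for the current period.
    """
    n = len(optimal)
    # Case (a): the aggregate rate itself is out of bounds, so every stream
    # is pinned to its bound and the aggregate becomes the sum of the bounds.
    if target <= sum(lower):
        return list(lower)
    if target >= sum(upper):
        return list(upper)
    # Closest point on the plane sum(r) == target: shift all rates equally.
    shift = (target - sum(optimal)) / n
    rates = [r + shift for r in optimal]
    tol = 1e-9 * max(1.0, abs(target))
    # Case (b): clamp out-of-bound rates, then spread the resulting change
    # over the streams that still have room in the needed direction.
    for _ in range(n):
        rates = [min(max(r, lo), hi) for r, lo, hi in zip(rates, lower, upper)]
        residual = target - sum(rates)
        if abs(residual) <= tol:
            break
        if residual > 0:
            free = [i for i in range(n) if rates[i] < upper[i]]
        else:
            free = [i for i in range(n) if rates[i] > lower[i]]
        for i in free:
            rates[i] += residual / len(free)
    return rates
```

With `target=10`, `optimal=[2, 3, 4]`, zero lower bounds and upper bounds `[2.1, 10, 10]`,
the first stream is held at its upper bound and the other two absorb the difference
equally. The equal redistribution is a design choice made for brevity; other rules for
sharing the residual among in-bound streams are possible.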


Figure 4.3: (a) Aggregate bit rate is out of bounds. (b) Closest rate vector
is out of bounds.
[Plots for real-time traffic with D = 4 periods; horizontal axis: frame number.]

Figure 4.4: (a) Rate function of total rate at the multiplexer output.
(b) Rate function of a single stream.

The complexity of aggregate smoothing is derived as follows. Steps 1 and 2 take
$O(H)$ time, where $H$ is the size of the look-ahead window. Steps 3 and 4 take
$N \cdot O(H)$ time (Step 4 takes constant time since it is a matrix multiplication
operation). The total algorithm time is then $(N + 1) \cdot O(H)$. In the case of
individual smoothing, the total computation time is $N \cdot O(H)$. So the percentage
increase in the algorithm complexity is $100 \times 1/N$. For large $N$, the increase in
algorithm complexity is almost insignificant. For example, for $N = 20$ and $N = 100$, the
complexity increases by only 5% and 1%, respectively.


4.3  Evaluation  of  Aggregate  Smoothing 

The aggregate smoothing scheme is illustrated in Figure 4.4. Both aggregate and
individual smoothing algorithms are applied to a group of 5 real-time video streams, each
with a smoothing delay of 4 frame periods. From Figure 4.4(a), it is observed that
aggregate smoothing provides a smoother rate function than individual smoothing for the
total rate at the multiplexer output. However, as shown in Figure 4.4(b), the rate function
of each stream is burstier for the case of aggregate smoothing since the rate determined by
aggregate smoothing is suboptimal.

In order to compare the performance of aggregate smoothing to that of individual
smoothing, MPEG traces of Formula 1 and Star Wars are used. Each trace consists of
36,000 frame sizes in bits, corresponding to a duration of 25 minutes at a display rate of
24 frames/sec. Three performance measures introduced in Chapter 2 are used: the number of
rate changes of the aggregate rate, the peak aggregate rate and the standard deviation of
the aggregate rate.
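The three performance measures can be computed from a sequence of per-period aggregate
rates as in the short Python sketch below. The function name `rate_metrics` is
hypothetical, and two details are assumptions of this sketch: a rate change is counted
whenever two consecutive periods have different rates, and the population standard
deviation is used.

```python
from statistics import pstdev


def rate_metrics(rates):
    """Return (number of rate changes, peak rate, standard deviation)."""
    # A change is counted at every period whose rate differs from the previous one.
    changes = sum(1 for prev, cur in zip(rates, rates[1:]) if prev != cur)
    return changes, max(rates), pstdev(rates)
```

For a piecewise-constant rate function, the first value is directly proportional to the
number of renegotiations an RCBR connection would need over the session.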

In the experiments, homogeneous sources are assumed: all video streams are copies of
the same original stream with different start times. This is realized by shifting the
original stream's arrival pattern by an amount $(i - 1) \cdot 36000 / N$ for
$i = 1, \ldots, N$, with the traces wrapped around to the beginning when they reach the
end. The smoothing delay is fixed to 4 periods with no buffer constraint for all video
streams. The size of the look-ahead window is 100 for stored video and 12 for real-time
video. First, the effect of the number of video sessions on the aggregate bit rate is
investigated. In Figure 4.5, the number of rate changes, peak rate and standard deviation
of rate as a function of the number of video sessions are presented for the case of stored
video. It is observed that while aggregate smoothing keeps the number of rate changes
almost constant, the number of rate changes increases for the case of individual
smoothing. The benefit of aggregate smoothing can be realized even with only two video
sessions. In general, when the number of multiplexed sources increases, the gain from
statistical multiplexing also increases. However, this has little effect on the number of
rate changes in the case of aggregate smoothing.
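The construction of homogeneous sources from a single trace can be sketched in Python as
follows. The function name `shifted_traces` is hypothetical, and two details are
assumptions of this sketch: the offset is rounded down when the trace length is not
divisible by the number of streams, and stream i starts reading the original trace at its
offset.

```python
def shifted_traces(trace, n_streams):
    """Build n_streams copies of `trace`, each starting at a different offset.

    Stream i (1-based) starts at offset (i - 1) * len(trace) // n_streams
    and wraps around to the beginning of the trace when it reaches the end.
    """
    length = len(trace)
    traces = []
    for i in range(1, n_streams + 1):
        offset = (i - 1) * length // n_streams
        traces.append(trace[offset:] + trace[:offset])
    return traces
```

With a 36,000-frame trace and 5 streams, the start offsets would be 0, 7200, 14400, 21600
and 28800 frames.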

Aggregate smoothing generally results in a higher peak rate than individual
smoothing because the algorithm tends to hold the rate constant for as long as the
constraints allow. For fewer than 7 video sessions, however, aggregate smoothing has a
lower peak rate due to the statistical multiplexing gain; this gain becomes less effective
when the number of video sessions exceeds 7.

As  expected,  the  standard  deviation  of  total  bit  rate  is  less  than  the  case  when  indi¬ 
vidual  smoothing  is  applied.  This  is  again  related  to  aggregate  smoothing’s  ability  to 
utilize  the  statistical  multiplexing  gain  which  results  in  smoother  rate  at  the  multiplexer 
output.  Therefore,  better  network  utilization  is  obtained  with  aggregate  smoothing. 

The same set of experiments was conducted for the case of real-time video. Figure
4.6 shows the experimental results for the three performance measures. As in the case of
stored video, aggregate smoothing reduces the number of rate changes dramatically,
especially when the number of video sessions is large. For example, the number of rate
changes is decreased by a factor of 6 when 15 Star Wars video sessions are multiplexed.
With individual smoothing, for a large number of video sessions the aggregate rate
function changes almost every period, resulting in a very high renegotiation cost from the
network viewpoint. With aggregate smoothing, the renegotiation cost is significantly
reduced and is almost constant, independent of the number of video sessions. Thus, RCBR
transport becomes a feasible solution when multiple video streams are sent as a bundle.
However, from the network utilization viewpoint, aggregate smoothing does not guarantee
better performance since the standard deviation of the aggregate rate function is higher
in some cases. In Figure 4.6, aggregate smoothing is outperformed by individual smoothing
for the Formula 1 trace, but it performs almost the same for the Star Wars trace. Neither
smoothing scheme has an advantage over the other from the peak rate viewpoint. Based on the
experimental results, it can be concluded that aggregate smoothing for real-time traffic is
beneficial from the RCBR transport viewpoint, although network utilization is lower than in
the case of individual smoothing. However, if the issue is to choose a transport service
with deterministic guarantees, then RCBR transport service is a better choice since it
provides the highest network utilization when compared to other services, e.g., regular
CBR or VBR transport. Therefore, the use of aggregate smoothing is justified from the
network utilization point of view for real-time traffic.

The effect of smoothing delay on the performance of aggregate smoothing is
investigated by conducting a second set of experiments using stored and real-time video.
In these experiments, the number of video sessions per multiplexer is fixed to 5. The
results are similar for both stored and real-time video, so only the experimental results
for stored video are presented in Figure 4.7. As expected, the advantage of aggregate
smoothing over individual smoothing decreases with larger smoothing delays. An interesting
observation is that the number of rate changes seems to be independent of the statistics of
the video sources when the smoothing delay is larger than 4 periods, as both the Formula 1
and Star Wars video sequences have almost the same number of rate changes. This allows for
approximate estimation of the renegotiation cost before the start of transmission given
the number of video sessions.

Figure 4.7: Effect of smoothing delay on the performance of
aggregate smoothing.


4.4  Effect  of  Aggregate  Smoothing  on  End-to-end 
Deterministic  Guarantees 

In Section 4.3, aggregate smoothing was shown to decrease the number of rate
changes and the standard deviation of the aggregate rate at the multiplexer output compared
to the case when smoothing is applied to each video stream as an independent process.
Assume that all channels in the network carry a group of multiplexed video streams. From
the network utilization viewpoint, aggregate smoothing is beneficial compared to
individual smoothing since the aggregate rate function at each multiplexer output is
smoother, resulting in higher network utilization as described in Chapter 3. However,
from the individual stream viewpoint, the resulting traffic profile for that particular
stream will be burstier. Measurements show that the peak rate of a stream under aggregate
smoothing can be 60% higher than when individual smoothing is applied to that stream. The
standard deviation is also observed to increase by more than 8%. This affects the network
QoS as perceived by a particular stream.

MPEG traces of Formula 1 and Star Wars video streams, each with 10,000 frames,
are used in the experiments in order to investigate the effect of aggregate smoothing on
the end-to-end deterministic delay bound. The video streams are first multiplexed onto a
link in a round-robin fashion with individual or aggregate smoothing applied at the
multiplexer. The packets of each group of streams are then served according to the FCFS
service discipline at a link speed of 620 Mbps. Figure 4.8 illustrates the simulation
scenario. The traces at each round-robin multiplexer have different start times as
described in Section 4.3, and wrap around to the beginning when they reach the end of the
original trace. Therefore, only homogeneous sources are considered both at the multiplexer
and switch level. The purpose of these experiments is to compare the performance of
aggregate smoothing to that of individual smoothing from the network QoS viewpoint.

[Figure 4.8: simulation scenario; recovered labels: smoothing buffer, round-robin.]

In the first set of experiments, the effect of network utilization on the end-to-end
delay bound is investigated for stored video. The number of streams per multiplexer is 5
and the smoothing delay is fixed at 4 frame periods for all streams. Figure 4.9 shows the
experimental results for both the Formula 1 and Star Wars traces. When the network is not
congested, individual smoothing performs better due to the smoother traffic it generates
for each stream. However, when the network load is increased, aggregate smoothing results
in a smaller delay bound, such that significant savings in the end-to-end delay bound can
be achieved for a single hop. For example, in Figure 4.9(a), at 68% network utilization,
the savings in end-to-end delay is 110 msecs for the Formula 1 trace. In Figure 4.9(b), the
end-to-end delay bound as a function of network utilization is shown. Aggregate smoothing
provides 200 msecs of end-to-end delay bound when the network load is 68%, whereas
individual smoothing can provide the same bound only when the network load is 62%.
This corresponds to an almost 10% relative increase in network utilization
(68/62 ≈ 1.097).

Figure 4.9: Effect of network utilization on (a) the savings in
end-to-end delay bound (b) the end-to-end delay.

In the second set of experiments, the effect of smoothing delay on the end-to-end
delay bound is investigated for a given network utilization. Figure 4.10 shows the
experimental results for both the Formula 1 and Star Wars traces. The results indicate
that savings in the end-to-end delay bound depend on the smoothing delay and that there is
an optimal value of smoothing delay for which maximum savings can be obtained. For the
Formula 1 trace at 69.75% network utilization, a smoothing delay of 9 periods gives the
maximum savings, whereas for the Star Wars trace at 73.74% network utilization, a
smoothing delay of 4 periods should be used. For larger values of smoothing delay, savings
can be negative, indicating that individual smoothing outperforms aggregate smoothing.
However, even in that case, the increase in end-to-end delay is small enough to justify
the use of aggregate smoothing for all network utilization levels.

Figure 4.10: Effect of smoothing delay on the savings in end-to-end delay.

4.5  Conclusion 

This chapter has introduced and described a new concept called aggregate smoothing
that integrates multiplexing and smoothing of multiple video sources grouped together and
transmitted as a bundle. The novel feature of aggregate smoothing is its ability to
satisfy the individual requirements of each video stream (in terms of delay and buffer
constraints) while smoothing the multiplexer output. It has been shown experimentally that
aggregate smoothing can reduce the number of rate changes significantly, such that RCBR
transport for real-time traffic can be efficient for multiple video sessions grouped
together. From the network viewpoint, aggregate smoothing provides smoother traffic and
thus higher network utilization compared to the case when individual smoothing is applied
to each video stream as an independent process. These benefits are provided at little
extra computational cost, which is insignificant when the number of video sessions is
over 20.




Chapter  5 

Conclusion  and  Future  Work 


The  burstiness  of  variable  bit  rate  (VBR)  traffic  makes  it  difficult  to  efficiently 
utilize  network  resources,  as  well  as  to  provide  guaranteed  end-to-end  network  quality  of 
service  (QoS)  to  the  traffic  sources.  Smoothing  or  shaping  the  traffic  at  the  entrance  of 
the  network  reduces  the  burstiness  thus  allowing  for  higher  utilization  within  the  network 
since  less  network  resource  is  required  for  the  smoothed  traffic.  In  this  report,  a  method¬ 
ology  has  been  proposed  that  provides  an  efficient  algorithm  for  smoothing  of  live  or 
stored  VBR  traffic  given  a  set  of  delay  and  buffer  constraints.  The  efficiency  of  the  pro¬ 
posed  smoothing  algorithm  has  been  demonstrated  by  integrating  it  with  a  bandwidth  al¬ 
location  scheme  that  allows  for  better  characterization  of  network  resource  requirements 
which  in  turn  results  in  higher  network  utilization  and  lower  transmission  cost.  The  rest 
of  the  chapter  is  organized  as  follows.  Section  5.1  summarizes  the  work  presented  in  this 
report.  Section  5.2  lists  the  main  contributions  of  the  dissertation.  Section  5.3  concludes 
the  chapter  with  future  work  where  a  number  of  issues  are  identified  which  can  be  pur¬ 
sued  in  the  future  using  the  framework  developed  in  this  report. 

5.1  Overview of Presented Work

In  the  first  chapter,  an  overview  of  communications  requirements  of  multimedia 
applications  was  given  along  with  a  comparison  of  ATM  network  transport  services  that 
can  be  used  for  VBR  traffic.  Among  the  possible  services,  RCBR  service  requires  the 
least  number  of  rate  changes  in  order  to  reduce  renegotiation  cost,  whereas  VBR  transport 
service  requires  the  smoothest  traffic  (with  little  variation  in  the  rate)  for  efficient  use  of 
network  resources  to  achieve  the  maximum  statistical  multiplexing  gain  when  several 
video  sources  are  multiplexed  onto  the  same  transmission  link.  Therefore,  a  smoothing 
algorithm  that  can  utilize  the  underlying  network  services  in  a  most  efficient  way  and  also 
can  address  the  diverse  requirements  of  network  clients  should  be  provided.  The  rest  of 
the  report  describes  the  design  and  specification  of  such  a  smoothing  algorithm  and  its 
effect  on  the  network  utilization. 

In the second chapter, a smoothing algorithm for lossless transmission of VBR traffic
was introduced for both real-time and stored data. A novel feature of the algorithm is
its ability to provide a unique solution to the diverse requirements of applications, expressed
in terms of delay and buffer bounds. The algorithm has been shown to smooth real-time
VBR traffic effectively in comparison with other techniques in the literature. This is due
to a novel approach that minimizes the effect of future traffic prediction error on the rate
function by decoupling past and future information from each other as far as possible.
This contribution has been demonstrated through a large number of experiments using
MPEG compressed video sequences. Even with a rudimentary forecasting rule, the causal
algorithm was shown to be as effective as the ideal smoothing scheme in which future
traffic is known. The proposed scheme can minimize either the number of rate changes or
the standard deviation of the rate function, depending on which network transport service
is available, e.g., RCBR or VBR service.
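
The two performance measures named above can be made concrete with a small sketch. It is an illustration only, not the report's implementation: it assumes the rate function is given as one rate value per time slot, and the function names and the two sample schedules are hypothetical.

```python
from statistics import pstdev


def rate_changes(rates):
    """Count the slots whose rate differs from the rate of the previous slot."""
    return sum(1 for prev, cur in zip(rates, rates[1:]) if cur != prev)


def rate_std(rates):
    """Population standard deviation of the per-slot rate."""
    return pstdev(rates)


# Two hypothetical schedules that carry the same total traffic over six slots.
few_changes = [4, 4, 4, 8, 8, 8]  # a single rate change, larger spread
low_spread = [5, 6, 7, 6, 5, 7]   # a change in almost every slot, smaller spread

for name, schedule in (("few_changes", few_changes), ("low_spread", low_spread)):
    print(name, rate_changes(schedule), round(rate_std(schedule), 3))
```

The first schedule is the better fit for a service that charges for renegotiation, such as RCBR, while the second has the smaller rate variation that favors statistical multiplexing under VBR service.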

In Chapter 3, the effect of smoothing on the deterministic end-to-end delay bound was
investigated. For the case of ideal smoothing, where future traffic is known, it was shown
both analytically and empirically that the extra delay contributed by smoothing is equal to
the saving in queuing delay when multiplexing homogeneous sources at a congested hop.
This indicates that, with ideal smoothing, it is possible to achieve higher network utilization
without any degradation in the QoS of the connection, even for a single hop. For
multiple congested nodes, smoothing results in significant reductions in the
end-to-end delay bound, since the sum of the savings in queuing delay at each congested hop
is more than the extra smoothing delay incurred at the source. This result is particularly
important since it proves that there exists a smoothing scheme with no negative effect on
network utilization. It was also demonstrated that sources with less demanding UPC
can actually cost the network more in terms of allocated resources than less bursty
sources.
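
The delay argument above can be written compactly. The notation is introduced here for illustration only and is not necessarily that of Chapter 3: d_s denotes the extra delay added by smoothing at the source, \Delta_h the reduction in the queuing delay bound at congested hop h, H the number of congested hops, and D_smooth and D_orig the end-to-end delay bounds with and without smoothing.

```latex
\begin{align*}
D_{\mathrm{smooth}} - D_{\mathrm{orig}} &= d_s - \sum_{h=1}^{H} \Delta_h, \\
H = 1: \quad d_s = \Delta_1 &\;\Longrightarrow\; D_{\mathrm{smooth}} = D_{\mathrm{orig}}, \\
H > 1: \quad \sum_{h=1}^{H} \Delta_h > d_s &\;\Longrightarrow\; D_{\mathrm{smooth}} < D_{\mathrm{orig}}.
\end{align*}
```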

Finally, the fourth chapter introduced a new concept called aggregate
smoothing, which integrates smoothing and multiplexing of multiple video sources grouped
together and transmitted as a bundle. The novel feature of aggregate smoothing is that
the individual constraints of each video session (delay and buffer bounds) are satisfied while
the total rate is smoothed at little extra computational cost, which is insignificant when the
number of video sessions is over 20. It has been shown experimentally that aggregate
smoothing reduces the number of rate changes significantly, so that RCBR transport for
real-time traffic can be a cost-effective network service for multiple video sessions
grouped together. From the network viewpoint, aggregate smoothing provides smoother
total traffic, so higher network utilization can be achieved than when
individual smoothing is applied to each video stream as an independent process.

5.2  Contributions 

The  main  contributions  of  this  report  are  summarized  below: 

1)  A  lossless  smoothing  algorithm  was  introduced  and  specified.  The  unique  fea¬ 
tures  of  the  algorithm  design  that  distinguish  it  from  other  approaches  are: 

•  The algorithm is not optimized for a specific set of constraints. Instead,
the specification of constraints is independent of the algorithm
design, which allows a wide set of constraints to be specified
and imposed by the multimedia applications.

•  The  algorithm  performance  can  be  optimized  for  a  specific  perform¬ 
ance  measure  which  allows  for  better  use  of  underlying  network 
services. 

•  The effect of traffic estimation error on the algorithm performance
was minimized by a novel approach that computes the upper bounds
using estimates of future traffic and the lower bounds using the history.
Therefore, accurate estimation of future traffic is less critical
to achieving good performance than it is in other
approaches.


2)  It  was  proved  both  analytically  and  empirically  that  ideal  smoothing  allows  the 
network  to  support  more  connections  with  the  same  end-to-end  QoS  guarantees 
for  any  number  of  hops  in  packet-switching  networks. 

3)  A new concept called aggregate smoothing was introduced, which integrates
multiplexing of VBR sources with smoothing. This is the first technique that
solves the problem of smoothing aggregated traffic while satisfying the individual
constraints of each source. It was shown that aggregate smoothing makes it
possible to utilize RCBR service for real-time traffic and at the same time increases
network utilization.

5.3  Future  Work 

In this section, some research issues that may be pursued using the framework
developed in this report are identified.

(1)  In the experiments, MPEG video was used to represent VBR traffic because of
its acceptance as the standard for digital compressed video and because it is expected that
future broadcasting and network services will use MPEG encoded bit streams for video
transmission. However, as stated in Chapter 2, the proposed smoothing algorithm can be
used with arbitrary traffic for which an estimation rule for future traffic is available. For MPEG
video, a simple estimate that uses the size of the corresponding picture in the previous GOP gives
good results, since pictures j and j-N are of the same type (I, B or P). An estimation rule must be
determined for other traffic, either by observing the statistical behavior of the traffic or by
utilizing the traffic structure, as in the case of MPEG video. The simple estimation rule for
MPEG video will not be sufficient for applications supporting full VCR functionality
through functions such as backward play, fast-forward or scanning, because each mode
generates a different type of traffic. Therefore, a suitable estimation rule must be determined and
passed to the smoothing function so that future traffic can be predicted for best performance.
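
The simple MPEG estimation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's code: it assumes a fixed GOP pattern of N pictures, and the function name, parameter names, and sample picture sizes are hypothetical.

```python
def estimate_picture_size(history, gop_length):
    """Estimate the size of the next picture from the picture one GOP earlier.

    history: sizes of the pictures observed so far, oldest first.
    gop_length: number of pictures per GOP (N in the text).
    Returns None until one full GOP has been observed.
    """
    if gop_length <= 0:
        raise ValueError("gop_length must be positive")
    if len(history) < gop_length:
        return None
    # The next picture has index j = len(history); picture j - N is history[-N].
    return history[-gop_length]


# Hypothetical sizes for one GOP with the pattern I B B P B B (N = 6).
sizes = [100, 20, 22, 50, 21, 19]
print(estimate_picture_size(sizes, 6))          # estimate for the next I picture
print(estimate_picture_size(sizes + [98], 6))   # estimate for the following B picture
```

A rule of this form relies on the periodic picture-type pattern, which is why it would not carry over directly to the VCR-style modes mentioned above.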

(2)  The  proposed  smoothing  algorithm  can  be  used  as  part  of  a  video  transport 
system  where  the  QoS  of  the  underlying  network  services  may  change  during  the  trans¬ 
mission.  For  networks  with  no  QoS  guarantees,  the  algorithm  can  be  extended  to  include 
varying  network  conditions  as  well.  Since  the  foundation  of  the  model  is  based  on  the 
status  of  buffers  at  the  server  and  client,  the  algorithm  can  adapt  to  changing  network 
conditions  by  using  the  feedback  from  the  client  to  recompute  the  bounds  based  on  in¬ 
formation  about  the  buffer  occupancies  at  the  client  and  server. 

(3)  In Chapter 3, only networks with deterministic guarantees on end-to-end performance
were considered when investigating the effect of smoothing on network utilization.
The study could be extended to include networks that provide statistical guarantees
in addition to deterministic guarantees on the QoS of the connection. The effect of
smoothing on network utilization can be better understood with more realistic network
modeling. Another area where the results of Chapter 4 can be used is the design of
networks that encourage their clients to smooth their traffic in order to achieve
maximum utilization of network resources. To this end, new connection admission control
policies and pricing schemes must be developed for better network resource management.


